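Preface
These days nearly everyone scrolls through a few videos after dinner or before bed, whether for the day's oddities, good-looking creators, or celebrity clips.
The app we are scraping today is one that broadens your horizons and teaches you something.
As a leading mid-length video platform in China, it keeps delivering quality content to all kinds of audiences,
letting people see a richer, deeper world, enjoy an easy sense of reward, and stay curious about life.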
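Preparation
Third-party modules:
- requests >>> pip install requests
Environment:
- Python 3.8 interpreter
- PyCharm editor
- ffmpeg, for merging audio and video
Implementation steps:
- Send the request (access the site)
- Get the data
- Parse the data (Base64 decoding)
- Save the data (video and audio)
- Merge the audio and video
Code
Import modules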
import requests  # sends HTTP requests (third-party)
import re  # built-in module
import json
import base64
import subprocess
import os
headers = {
    'cookie': 'MONITOR_WEB_ID=c27b9f4a-4917-4256-be93-e948308467e3; ttcid=0cbb8baca16443e8b2320dfcb0ebd3ab24; __gads=ID=b750d35ceb3b300e-22f59bfba5d0002a:T=1645008733:RT=1645008733:S=ALNI_MZSPYii3eywVYfjuGdExhE-Dw3tLw; BD_REF=1; support_webp=true; support_avif=true; _tea_utm_cache_1300=undefined; s_v_web_id=verify_l2kdgr6l_ZlYcneu1_fb24_4lQM_A1cp_pBZKlKxvJKzJ; passport_csrf_token=7e1f1777c680a1dd9f163d6916212e62; passport_csrf_token_default=7e1f1777c680a1dd9f163d6916212e62; sid_guard=880626da6250e5535bcc3b35a5804a5c%7C1651232961%7C3023999%7CFri%2C+03-Jun-2022+11%3A49%3A20+GMT; uid_tt=d87f79c88dc25ca91c644549863616c8; uid_tt_ss=d87f79c88dc25ca91c644549863616c8; sid_tt=880626da6250e5535bcc3b35a5804a5c; sessionid=880626da6250e5535bcc3b35a5804a5c; sessionid_ss=880626da6250e5535bcc3b35a5804a5c; sid_ucp_v1=1.0.0-KGE4ZTdhODI0MjQ3Y2IyY2Y2ZmQwYjkzYTFhNDljYjdjYjdhM2U3OTgKFAjo5IrYFxDBoa-TBhgYIAw4CEAFGgJsZiIgODgwNjI2ZGE2MjUwZTU1MzViY2MzYjM1YTU4MDRhNWM; ssid_ucp_v1=1.0.0-KGE4ZTdhODI0MjQ3Y2IyY2Y2ZmQwYjkzYTFhNDljYjdjYjdhM2U3OTgKFAjo5IrYFxDBoa-TBhgYIAw4CEAFGgJsZiIgODgwNjI2ZGE2MjUwZTU1MzViY2MzYjM1YTU4MDRhNWM; odin_tt=ab7eaf992f0e5cc3871fd8fde7797f8253548498d52cd8f6320c1d408d8fb5f853f6b88fe9d3e249e91b0baac908955a; tt_scid=yZBs23biytSrdLbhg4PwtQsnp5iRak5-8X3Y.rM36zEzqMDW4OWKwf0CAfb4Sa8r725a; ttwid=1%7Cbki1kBY9AbTODWRF62oQmAFNNd1E9JpOrWrMnRcIdwY%7C1651234433%7C69cbf75423181a837f3739e9b73665b4dc82f1070d93934d5843d3ece167b776; __ac_nonce=0626bd85f00123bbca353; __ac_signature=_02B4Z6wo00f010qt8RAAAIDCKacxeDkkRtdKifWAALDLGZ5UTxtgNht0fiirvQ84GFg6fgEpzmKoOpzBna11K-91eblu7vLsme2e9DrawirS.iQkhzxxQA-2FbYMTkKz.zBC6phs4yeOUKGUc6; ixigua-a-s=3; msToken=wDc7U1VNr5xcJOObHh92pRLYNHcJkoa27rC9g9KpqtmyPZRHrp8KwNXRK82rkr2w-XEzqsGab7i_YSSrqQLCbvxl9etcaF4ElWGCvfE9-94Wyw4v8Fuq-LcizatEUIE=',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.41 Safari/537.36'
}
url = 'https://www.ixigua.com/7090467065097617931'
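A cookie copied from a browser belongs to one session and eventually stops working, so readers will likely need to supply their own. The sketch below shows one way to build the same headers dictionary without hardcoding the cookie. The function name build_headers and the environment variable IXIGUA_COOKIE are hypothetical, chosen only for this illustration.

```python
import os

DEFAULT_USER_AGENT = (
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/101.0.4951.41 Safari/537.36'
)


def build_headers(cookie=None, user_agent=DEFAULT_USER_AGENT):
    """Build request headers, reading the cookie from the environment if none is given."""
    if cookie is None:
        # IXIGUA_COOKIE is a hypothetical variable name used only in this sketch.
        cookie = os.environ.get('IXIGUA_COOKIE', '')
    headers = {'user-agent': user_agent}
    if cookie:
        # Only send a cookie header when there is something to send.
        headers['cookie'] = cookie
    return headers


headers = build_headers(cookie='ttwid=example')
print(headers['cookie'])
```

The resulting dictionary can be passed to requests.get(url, headers=headers) exactly like the hardcoded one.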
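1. Send the request (access the site)
response = requests.get(url, headers=headers)
# Set the encoding explicitly to avoid garbled text
response.encoding = 'utf-8'
# Printing response shows <Response [200]> when the request succeeded
2. Get the data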
html_data = response.text
3. Parse the data (Base64 decoding)
# _SSR_HYDRATED_DATA=(.*?)</script>
# (.*?) matches any characters except newlines, as few as possible
json_str = re.findall('_SSR_HYDRATED_DATA=(.*?)</script>', html_data)[0]
# Replace undefined with null so the text is valid JSON
json_str = json_str.replace('undefined', 'null')
json_dict = json.loads(json_str)
title = json_dict['anyVideo']['gidInformation']['packerData']['video']['title']
title = title.replace(' ', '')
video_url = json_dict['anyVideo']['gidInformation']['packerData']['video']['videoResource']['dash']['dynamic_video']['dynamic_video_list'][-1]['main_url']
audio_url = json_dict['anyVideo']['gidInformation']['packerData']['video']['videoResource']['dash']['dynamic_video']['dynamic_audio_list'][-1]['main_url']
# main_url is Base64-encoded: decode to bytes, then to str
video_url = base64.b64decode(video_url)
audio_url = base64.b64decode(audio_url)
video_url = video_url.decode()
audio_url = audio_url.decode()
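The parsing step can be tried offline on a small synthetic page. The sketch below wraps the same three operations (regex extraction, undefined to null, Base64 decoding) in two helper functions. The names extract_hydrated_data and decode_main_url and the sample HTML are assumptions made for this illustration, and the sample is far simpler than a real page.

```python
import base64
import json
import re


def extract_hydrated_data(html):
    """Return the _SSR_HYDRATED_DATA payload of a page as a dict."""
    match = re.search(r'_SSR_HYDRATED_DATA=(.*?)</script>', html)
    if match is None:
        raise ValueError('_SSR_HYDRATED_DATA not found in page')
    # The payload is a JavaScript object literal, so undefined must
    # become null before json.loads will accept it.
    return json.loads(match.group(1).replace('undefined', 'null'))


def decode_main_url(encoded):
    """Decode a Base64-encoded main_url into a plain string."""
    return base64.b64decode(encoded).decode('utf-8')


# Synthetic page for illustration only.
sample_url = 'https://example.com/video.mp4'
sample_html = (
    '<script>window._SSR_HYDRATED_DATA={"title": "demo", "extra": undefined, '
    '"main_url": "' + base64.b64encode(sample_url.encode()).decode() + '"}</script>'
)
data = extract_hydrated_data(sample_html)
print(data['title'], decode_main_url(data['main_url']))
```

One caveat with the plain string replacement: it also rewrites the word undefined if it happens to appear inside a title or description, so the extracted text may differ slightly from the page.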
4. Save the data (video and audio)
video_data = requests.get(video_url).content
with open(f'{title}.mp4', mode='wb') as f:
    f.write(video_data)
audio_data = requests.get(audio_url).content
with open(f'{title}.mp3', mode='wb') as f:
    f.write(audio_data)
# Merge audio and video: passing the arguments as a list works without a shell
# and keeps file names with special characters intact
ffmpeg = ['ffmpeg', '-i', f'{title}.mp4', '-i', f'{title}.mp3', '-acodec', 'copy', '-vcodec', 'copy', f'{title}-out.mp4']
subprocess.run(ffmpeg)
# Delete the intermediate files once the merged file exists
os.remove(f'{title}.mp3')
os.remove(f'{title}.mp4')
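The merge step depends on the title being usable as a file name and on ffmpeg receiving each argument intact. The sketch below shows one possible way to handle both. The helpers safe_filename and build_merge_command are hypothetical names introduced here, and the set of characters replaced is an assumption based on what Windows rejects in file names.

```python
import re


def safe_filename(title):
    """Replace whitespace and characters that Windows rejects in file names."""
    return re.sub(r'[\\/:*?"<>|\s]+', '_', title).strip('_')


def build_merge_command(video_path, audio_path, output_path):
    """Build the ffmpeg argument list that copies both streams into one file."""
    return [
        'ffmpeg',
        '-i', video_path,
        '-i', audio_path,
        '-vcodec', 'copy',  # copy the video stream without re-encoding
        '-acodec', 'copy',  # copy the audio stream without re-encoding
        output_path,
    ]


name = safe_filename('What is: a "DASH" stream?')
command = build_merge_command(f'{name}.mp4', f'{name}.mp3', f'{name}-out.mp4')
print(command)
```

Passing the list to subprocess.run(command, check=True) would run the merge and raise an error if ffmpeg exits with a failure code. This assumes ffmpeg is installed and on the PATH.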
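Closing remarks
That's all for this article!
If you have suggestions or questions, leave a comment or send me a private message. Let's keep at it together (ง •_•)ง
If you liked it, follow the blog, or give the article a like, a bookmark, or a comment!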
Source: https://www.cnblogs.com/jnjnj/p/16258755.html