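Preface 😋
Hello everyone~ This is Qianqian, who enjoys looking at pretty girls.
Empowered by technology: using tech to raise each person's own sense of happiness.
On Kuaishou, users can record moments of their lives with photos and short videos, and can interact with their fans in real time through live streams.
Kuaishou's content covers every aspect of daily life, and its users are spread all over the country.
Here people can find content they like and people they are interested in, see a more real and interesting world, and let the world discover their own real and interesting selves.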
Key points:
- Capturing dynamically loaded data (packet capture)
- Sending requests with requests
- Parsing JSON data
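Development environment:
- Python 3.8: runs the code
- PyCharm 2021.2: helps with writing the code
- requests: third-party module for sending requests (a Python tool for accessing websites)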
Implementation steps:
- Send the request
- Get the data
- Parse the data
- Save the data
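Video collection code
Import the modules
import requests  # third-party module, sends the requests
import re  # standard library, used to clean up file names
Disguise the request (headers)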
headers = {
    'content-type': 'application/json',
    'Cookie': 'kpf=PC_WEB; kpn=KUAISHOU_VISION; clientid=3; did=web_d3f9d8c2cbebafd126b80eb0b1c13360; client_key=65890b29; didv=1658130458000; userId=270932146; kuaishou.server.web_st=ChZrdWFpc2hvdS5zZXJ2ZXIud2ViLnN0EqABymzXlGDinYWz3v5NKZWKq6Ld14uOvyRNPT3Gi7uJwI8CE4aatjowKRbPtRt5YIE3s2otZdFEzL7kvW1PQuijqUT_qUe4-u0FlfN1S49mhR4QRc9YKQNObXAPYzZRWIRcrSvdohIwUW8TBTSWLUtMlMh2He2FyvNMR-JfhUHaK-YSkwqXKUj-N-zlHTCPp0z0y6cSgrR9RIdlXqIJFifSbxoSsguEA2pmac6i3oLJsA9rNwKEIiB86mXKYIgbGBbtkVuyoy8TCIwZ2uckiTnfAGZiyV9imCgFMAE; kuaishou.server.web_ph=7353170c91b8f7f05c250730c2faea5355e1',
    'Host': 'www.kuaishou.com',
    'Origin': 'https://www.kuaishou.com',
    'Referer': 'https://www.kuaishou.com/search/video?searchKey=%E6%B3%B3%E8%A3%85%E5%B0%8F%E5%A7%90%E5%A7%90',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}
for page in range(1, 11):
    # Request payload; only POST requests carry one
    json = {
        'operationName': "visionSearchPhoto",
        'query': "fragment photoContent on PhotoEntity {\n id\n duration\n caption\n likeCount\n viewCount\n realLikeCount\n coverUrl\n photoUrl\n photoH265Url\n manifest\n manifestH265\n videoResource\n coverUrls {\n url\n __typename\n }\n timestamp\n expTag\n animatedCoverUrl\n distance\n videoRatio\n liked\n stereoType\n profileUserTopPhoto\n __typename\n}\n\nfragment feedContent on Feed {\n type\n author {\n id\n name\n headerUrl\n following\n headerUrls {\n url\n __typename\n }\n __typename\n }\n photo {\n ...photoContent\n __typename\n }\n canAddComment\n llsid\n status\n currentPcursor\n __typename\n}\n\nquery visionSearchPhoto($keyword: String, $pcursor: String, $searchSessionId: String, $page: String, $webPageArea: String) {\n visionSearchPhoto(keyword: $keyword, pcursor: $pcursor, searchSessionId: $searchSessionId, page: $page, webPageArea: $webPageArea) {\n result\n llsid\n webPageArea\n feeds {\n ...feedContent\n __typename\n }\n searchSessionId\n pcursor\n aladdinBanner {\n imgUrl\n link\n __typename\n }\n __typename\n }\n}\n",
        'variables': {'keyword': "泳装小姐姐", 'pcursor': str(page), 'page': "search", 'searchSessionId': "MTRfMjcwOTMyMTQ2XzE2NTg5MjM5NDExODBf5rOz6KOF5bCP5aeQ5aeQXzE4NzQ"}
    }
    url = 'https://www.kuaishou.com/graphql'
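1. Send the request
    response = requests.post(url=url, headers=headers, json=json)
<Response [200]>: the request succeeded.
<Response [400]>: bad request; the server could not process what was sent. (A resource that cannot be found on the server is reported as 404.)
A successful request and actually being given the data are two separate things.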
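To look up what a status code means, the standard library already carries the official reason phrases. The snippet below is a small sketch; `describe_status` is a hypothetical helper name, not something from the original code.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return the code together with its standard reason phrase."""
    return f"{code} {HTTPStatus(code).phrase}"


# Print the meaning of a few codes that commonly show up while scraping.
for code in (200, 400, 404):
    print(describe_status(code))
```

With requests, the same number is available as `response.status_code`, and `response.raise_for_status()` raises an exception for 4xx and 5xx responses.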
2. Get the data
.text: the response body as a string
.json(): the response body parsed into a dictionary
    json_data = response.json()
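The difference between the two can be tried without touching the network. This is a sketch that assumes a tiny made-up response body (`raw_text` is hypothetical sample data); `json.loads` stands in for what `.json()` does with the text.

```python
import json

# Hypothetical response body, shaped like the search result used in this post.
raw_text = '{"data": {"visionSearchPhoto": {"feeds": []}}}'

# Parsing the string yields nested dictionaries and lists.
parsed = json.loads(raw_text)

print(type(raw_text).__name__)  # the raw body is a str
print(type(parsed).__name__)    # the parsed body is a dict
print(parsed["data"]["visionSearchPhoto"]["feeds"])
```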
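3. Parse the data
xpath, css: only work on data that is in the page's HTML source
re: still works when xpath, css and json all fail, but is more complex to write
json: only works on data shaped like {"": ""} or ["", ""]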
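To make the trade-off concrete, here is a sketch that pulls the same field out of one string twice, once by parsing it as JSON and once with a regular expression. The `sample` string and its example.com URL are made up for illustration.

```python
import json
import re

# Hypothetical JSON text containing one video entry.
sample = '{"photo": {"caption": "demo clip", "photoUrl": "https://example.com/v/1.mp4"}}'

# json: parse once, then index by key. Short, but needs well-formed JSON.
url_from_json = json.loads(sample)["photo"]["photoUrl"]

# re: works on any text, but the pattern must be written and maintained by hand.
match = re.search(r'"photoUrl":\s*"(.*?)"', sample)
url_from_re = match.group(1) if match else None

print(url_from_json)
print(url_from_re)
```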
    feeds = json_data['data']['visionSearchPhoto']['feeds']
    for i in range(0, len(feeds)):
        photoUrl = feeds[i]['photo']['photoUrl']
        caption = feeds[i]['photo']['caption']
        print(caption, photoUrl)
        # Remove characters that are not allowed in Windows file names
        caption = re.sub(r'[\\/:*?"<>|\n]', '', caption)
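Indexing straight into the response fails with a KeyError or TypeError as soon as one level is missing, and a caption made only of forbidden characters ends up as an empty file name. The sketch below is one defensive variant; `safe_filename`, `iter_videos` and the sample data are hypothetical names introduced here, not part of the original script.

```python
import re

# Characters that Windows does not allow in file names, plus line breaks.
INVALID_CHARS = re.compile(r'[\\/:*?"<>|\r\n]')


def safe_filename(caption, fallback="untitled"):
    """Strip forbidden characters; fall back if nothing is left."""
    cleaned = INVALID_CHARS.sub("", caption).strip()
    return cleaned or fallback


def iter_videos(json_data):
    """Yield (file name, video URL) pairs, skipping incomplete entries."""
    data = json_data.get("data") or {}
    feeds = (data.get("visionSearchPhoto") or {}).get("feeds") or []
    for feed in feeds:
        photo = feed.get("photo") or {}
        url = photo.get("photoUrl")
        if url:
            yield safe_filename(photo.get("caption") or ""), url


# Made-up data in the same shape as the search response.
sample = {"data": {"visionSearchPhoto": {"feeds": [
    {"photo": {"caption": "a/b:c?\n", "photoUrl": "u1"}},
    {"photo": {"caption": "no url", "photoUrl": None}},
]}}}
videos = list(iter_videos(sample))
print(videos)
```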
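4. Save the video
In general, on most websites, video, image and audio links can be fetched directly with get.
.content: the binary data of the video
        video_data = requests.get(photoUrl).content
        # The video/ folder must already exist
        with open(f'video/{caption}.mp4', mode='wb') as f:
            f.write(video_data)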
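Reading `.content` loads the whole video into memory, and `open` fails if the target folder is missing. The sketch below shows one possible alternative that writes the data piece by piece and creates the folder first. `save_chunks` is a hypothetical helper, and the byte chunks in the demo are fake data written to a temporary folder.

```python
import tempfile
from pathlib import Path


def save_chunks(chunks, directory, name):
    """Write an iterable of byte chunks to <directory>/<name>.mp4."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    target = target_dir / f"{name}.mp4"
    written = 0
    with open(target, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty keep-alive chunks
                written += f.write(chunk)
    return target, written


# Demo with fake chunks in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    path, size = save_chunks([b"ab", b"", b"cd"], Path(tmp) / "video", "demo")
    saved = path.read_bytes()
print(path.name, size)
```

With requests, the chunks could come from `requests.get(photoUrl, stream=True).iter_content(chunk_size=8192)`.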
Automatic commenting and liking
import requests


class KuaiShou():
    def __init__(self):
        self.headers = {
            'content-type': 'application/json',
            'Cookie': 'kpf=PC_WEB; kpn=KUAISHOU_VISION; clientid=3; did=web_d3f9d8c2cbebafd126b80eb0b1c13360; client_key=65890b29; didv=1658130458000; userId=270932146; kuaishou.server.web_st=ChZrdWFpc2hvdS5zZXJ2ZXIud2ViLnN0EqABymzXlGDinYWz3v5NKZWKq6Ld14uOvyRNPT3Gi7uJwI8CE4aatjowKRbPtRt5YIE3s2otZdFEzL7kvW1PQuijqUT_qUe4-u0FlfN1S49mhR4QRc9YKQNObXAPYzZRWIRcrSvdohIwUW8TBTSWLUtMlMh2He2FyvNMR-JfhUHaK-YSkwqXKUj-N-zlHTCPp0z0y6cSgrR9RIdlXqIJFifSbxoSsguEA2pmac6i3oLJsA9rNwKEIiB86mXKYIgbGBbtkVuyoy8TCIwZ2uckiTnfAGZiyV9imCgFMAE; kuaishou.server.web_ph=7353170c91b8f7f05c250730c2faea5355e1',
            'Host': 'www.kuaishou.com',
            'Origin': 'https://www.kuaishou.com',
            'Referer': 'https://www.kuaishou.com/search/video?searchKey=%E6%B3%B3%E8%A3%85%E5%B0%8F%E5%A7%90%E5%A7%90',
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
        }
        self.url = 'https://www.kuaishou.com/graphql'

    def getSearch(self, keyword, page):
        """
        Search for videos
        :param keyword: search keyword
        :param page: page number
        :return: json_data
        """
        json = {
            'operationName': "visionSearchPhoto",
            'query': "fragment photoContent on PhotoEntity {\n id\n duration\n caption\n likeCount\n viewCount\n realLikeCount\n coverUrl\n photoUrl\n photoH265Url\n manifest\n manifestH265\n videoResource\n coverUrls {\n url\n __typename\n }\n timestamp\n expTag\n animatedCoverUrl\n distance\n videoRatio\n liked\n stereoType\n profileUserTopPhoto\n __typename\n}\n\nfragment feedContent on Feed {\n type\n author {\n id\n name\n headerUrl\n following\n headerUrls {\n url\n __typename\n }\n __typename\n }\n photo {\n ...photoContent\n __typename\n }\n canAddComment\n llsid\n status\n currentPcursor\n __typename\n}\n\nquery visionSearchPhoto($keyword: String, $pcursor: String, $searchSessionId: String, $page: String, $webPageArea: String) {\n visionSearchPhoto(keyword: $keyword, pcursor: $pcursor, searchSessionId: $searchSessionId, page: $page, webPageArea: $webPageArea) {\n result\n llsid\n webPageArea\n feeds {\n ...feedContent\n __typename\n }\n searchSessionId\n pcursor\n aladdinBanner {\n imgUrl\n link\n __typename\n }\n __typename\n }\n}\n",
            'variables': {'keyword': keyword, 'pcursor': str(page), 'page': "search", 'searchSessionId': "MTRfMjcwOTMyMTQ2XzE2NTg5MjM5NDExODBf5rOz6KOF5bCP5aeQ5aeQXzE4NzQ"}
        }
        json_data = requests.post(url=self.url, headers=self.headers, json=json).json()
        return json_data

    def isLike(self, photoAuthorId, photoId):
        """
        Like a video
        :param photoAuthorId: id of the video's author
        :param photoId: id of the video
        :return: json_data
        """
        json = {
            'operationName': "visionVideoLike",
            'query': "mutation visionVideoLike($photoId: String, $photoAuthorId: String, $cancel: Int, $expTag: String) {\n visionVideoLike(photoId: $photoId, photoAuthorId: $photoAuthorId, cancel: $cancel, expTag: $expTag) {\n result\n __typename\n }\n}\n",
            'variables': {
                'cancel': 0,
                'expTag': "1_a/2001481596260506114_xpcwebsearchxxnull0",
                'photoAuthorId': photoAuthorId,
                'photoId': photoId
            }
        }
        json_data = requests.post(url=self.url, headers=self.headers, json=json).json()
        return json_data

    def postComment(self, content, photoAuthorId, photoId):
        """
        Post a comment
        :param content: comment text
        :param photoAuthorId: id of the video's author
        :param photoId: id of the video
        :return: json_data
        """
        json = {
            'operationName': "visionAddComment",
            'query': "mutation visionAddComment($photoId: String, $photoAuthorId: String, $content: String, $replyToCommentId: ID, $replyTo: ID, $expTag: String) {\n visionAddComment(photoId: $photoId, photoAuthorId: $photoAuthorId, content: $content, replyToCommentId: $replyToCommentId, replyTo: $replyTo, expTag: $expTag) {\n result\n commentId\n content\n timestamp\n status\n __typename\n }\n}\n",
            'variables': {
                'content': content,
                'expTag': "1_a/2001481596260506114_xpcwebsearchxxnull0",
                'photoAuthorId': photoAuthorId,
                'photoId': photoId
            }
        }
        json_data = requests.post(url=self.url, headers=self.headers, json=json).json()
        return json_data

    def getComment(self, photoId, pcursor):
        """
        Get comments
        :param photoId: id of the video
        :param pcursor: page cursor
        :return: the comment data
        """
        json = {
            'operationName': "commentListQuery",
            'query': "query commentListQuery($photoId: String, $pcursor: String) {\n visionCommentList(photoId: $photoId, pcursor: $pcursor) {\n commentCount\n pcursor\n rootComments {\n commentId\n authorId\n authorName\n content\n headurl\n timestamp\n likedCount\n realLikedCount\n liked\n status\n subCommentCount\n subCommentsPcursor\n subComments {\n commentId\n authorId\n authorName\n content\n headurl\n timestamp\n likedCount\n realLikedCount\n liked\n status\n replyToUserName\n replyTo\n __typename\n }\n __typename\n }\n __typename\n }\n}\n",
            'variables': {'photoId': photoId, 'pcursor': str(pcursor)}
        }
        json_data = requests.post(url=self.url, headers=self.headers, json=json).json()
        return json_data

    def getUserInfo(self, userId):
        """
        Get user information
        :param userId: user id
        :return: json_data
        """
        json = {
            'operationName': "visionProfile",
            'query': "query visionProfile($userId: String) {\n visionProfile(userId: $userId) {\n result\n hostName\n userProfile {\n ownerCount {\n fan\n photo\n follow\n photo_public\n __typename\n }\n profile {\n gender\n user_name\n user_id\n headurl\n user_text\n user_profile_bg_url\n __typename\n }\n isFollowing\n __typename\n }\n __typename\n }\n}\n",
            'variables': {'userId': userId}
        }
        json_data = requests.post(url=self.url, headers=self.headers, json=json).json()
        return json_data

    def getUserPhoto(self, userId, pcursor):
        """
        Get a user's videos
        :param userId: user id
        :param pcursor: page cursor
        :return: json_data
        """
        json = {
            'operationName': "visionProfilePhotoList",
            'query': "fragment photoContent on PhotoEntity {\n id\n duration\n caption\n likeCount\n viewCount\n realLikeCount\n coverUrl\n photoUrl\n photoH265Url\n manifest\n manifestH265\n videoResource\n coverUrls {\n url\n __typename\n }\n timestamp\n expTag\n animatedCoverUrl\n distance\n videoRatio\n liked\n stereoType\n profileUserTopPhoto\n __typename\n}\n\nfragment feedContent on Feed {\n type\n author {\n id\n name\n headerUrl\n following\n headerUrls {\n url\n __typename\n }\n __typename\n }\n photo {\n ...photoContent\n __typename\n }\n canAddComment\n llsid\n status\n currentPcursor\n __typename\n}\n\nquery visionProfilePhotoList($pcursor: String, $userId: String, $page: String, $webPageArea: String) {\n visionProfilePhotoList(pcursor: $pcursor, userId: $userId, page: $page, webPageArea: $webPageArea) {\n result\n llsid\n webPageArea\n feeds {\n ...feedContent\n __typename\n }\n hostName\n pcursor\n __typename\n }\n}\n",
            'variables': {'userId': userId, 'pcursor': pcursor, 'page': "profile"}
        }
        json_data = requests.post(url=self.url, headers=self.headers, json=json).json()
        return json_data


if __name__ == '__main__':
    kuaishou = KuaiShou()
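Every method in the class builds a dictionary with the same three keys: operationName, query and variables. As a sketch of how that repetition could be factored out, the hypothetical helper `build_payload` below assembles the dictionary in one place; the short query string in the demo is a shortened stand-in, not the full query used above.

```python
def build_payload(operation_name, query, **variables):
    """Assemble the three-key request body used by the GraphQL endpoint."""
    return {
        "operationName": operation_name,
        "query": query,
        "variables": variables,
    }


# Demo with a shortened stand-in query and a made-up user id.
payload = build_payload(
    "visionProfile",
    "query visionProfile($userId: String) { visionProfile(userId: $userId) { result } }",
    userId="example-user-id",
)
print(payload["operationName"], payload["variables"])
```

Each method would then only need to supply its own operation name, query text and variables before posting the payload.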
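Closing words 💝
Thank you for reading my article~ This flight ends here 🛬
I hope this article was of some help to you 🎉 and that you picked up a little knowledge~
Even the stars in hiding 🍥 are working hard to shine, so keep working hard too (let's do our best together).
Finally, the blogger would like to ask for your triple combo (like, comment, favorite); it costs nothing, so why not~
If you don't know what to comment, even typing 6666 is an encouragement to the blogger 💞 Thank you 💐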
Source: https://www.cnblogs.com/Qqun261823976/p/16543443.html