Scraping Tencent's real-time epidemic data with Python and storing it in a MySQL database - 百木园

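Approach:

Open the browser developer tools (F12) on the Tencent epidemic data site to inspect how the page loads its data, then use Python to scrape both the current day's data and the historical data, storing them in two MySQL tables, details and history, respectively.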
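The storage step for the details table might look roughly like the sketch below. The column names and types are assumptions chosen to hold one row per city (update time, province, city, cumulative confirmed, new confirmed, recovered, deaths); the original schema is not shown here. `save_details` is a hypothetical helper that accepts any DB-API connection whose driver uses `%s` placeholders, as PyMySQL does.

```python
# Hypothetical schema: column names and types are assumptions, not the author's.
CREATE_DETAILS_SQL = """
CREATE TABLE IF NOT EXISTS details (
    id INT AUTO_INCREMENT PRIMARY KEY,
    update_time DATETIME,
    province VARCHAR(50),
    city VARCHAR(50),
    confirm INT,
    confirm_add INT,
    heal INT,
    dead INT
)
"""

# Parameterized insert; values are passed separately so the driver escapes them.
INSERT_DETAILS_SQL = (
    "INSERT INTO details "
    "(update_time, province, city, confirm, confirm_add, heal, dead) "
    "VALUES (%s, %s, %s, %s, %s, %s, %s)"
)


def save_details(conn, rows):
    """Create the table if needed, insert all rows, and commit once."""
    cursor = conn.cursor()
    try:
        cursor.execute(CREATE_DETAILS_SQL)
        cursor.executemany(INSERT_DETAILS_SQL, rows)
        conn.commit()
    except Exception:
        # Undo the partial insert and let the caller see the error
        conn.rollback()
        raise
    finally:
        cursor.close()
    return len(rows)
```

With PyMySQL installed, `conn` could come from `pymysql.connect(...)` using your own host, user, password and database name; each row is a seven-item list in the column order above.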

① This function scrapes the detailed daily epidemic data

import json
import time

import requests


def get_details():
    url = 'https://view.inews.qq.com/g2/getOnsInfo?name=disease_h5&callback=jQuery34102848205531413024_1584924641755&_=1584924641756'
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3741.400 QQBrowser/10.5.3863.400'
    }
    res = requests.get(url, headers=headers)
    # Print the full response
    # print(res.text)
    # Strip the JSONP wrapper (the callback name, "(" and the trailing ")") before parsing
    response_data = json.loads(res.text.replace('jQuery34102848205531413024_1584924641755(', '')[:-1])
    # Keys of this dict: dict_keys(['ret', 'data']). ret is the status code (0 means the request succeeded); data holds the data we need
    # print(response_data.keys())
    """The response has already been parsed into a dict once. Its data value is still a string, so it has to be parsed into a dict again:
    print(json.loads(response_data['data']).keys())
    Result:
    dict_keys(['lastUpdateTime', 'chinaTotal', 'chinaAdd', 'isShowAdd', 'showAddSwitch',
    'areaTree', 'chinaDayList', 'chinaDayAddList', 'dailyNewAddHistory', 'dailyHistory',
    'wuhanDayList', 'articleList'])
    lastUpdateTime is the time of the latest update, chinaTotal holds the national totals, chinaAdd holds the national new-case figures,
    isShowAdd indicates whether new-case data is displayed, showAddSwitch selects which data is displayed, and areaTree holds the nationwide data
    """
    temp = json.loads(response_data['data'])
    areaTree_data = temp['areaTree']
    # print(temp.keys())
    # print(areaTree_data[0].keys())
    """
    Take areaTree from the parent dict,
    then look at the keys of the China entry:
    print(areaTree_data[0].keys())
    dict_keys(['name', 'today', 'total', 'children'])
    name is the country name, today is today's data, total is the cumulative data, and children holds the data for every region of the country. We need the regional data, so inspect children:
    print(areaTree_data[0]['children'])
    In each entry,
    name is the region name, today is today's data, total is the cumulative data, and children is the city-level data.
    This endpoint therefore gives us the totals for every region. We iterate over the list and take name, which is the province-level name; we also need the city-level data,
    so from children (the city-level data) we take name, then confirm, heal and dead under total (cumulative totals), and confirm under today (the day's increase).
    These are the values we need
    """
    # print(areaTree_data[0]['children'])
    # for province_data in areaTree_data[0]['children']:
    #     print(province_data)

    ds = temp['lastUpdateTime']
    details = []
    for pro_infos in areaTree_data[0]['children']:
        province_name = pro_infos['name']  # province name
        for city_infos in pro_infos['children']:
            city_name = city_infos['name']  # city name
            confirm = city_infos['total']['confirm']  # cumulative confirmed cases
            confirm_add = city_infos['today']['confirm']  # new confirmed cases today
            heal = city_infos['total']['heal']  # recovered
            dead = city_infos['total']['dead']  # deaths
            # print(ds, province_name, city_name, confirm, confirm_add, heal, dead)
            details.append([ds, province_name, city_name, confirm, confirm_add, heal, dead])
    return details
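The function above strips the JSONP wrapper by replacing one specific callback name, which stops working if the callback parameter in the URL changes. A slightly more tolerant variant could use a regular expression instead. This is only a sketch: `unwrap_jsonp` and `parse_payload` are hypothetical helper names, and the sample string is synthetic, not real output from the endpoint.

```python
import json
import re

# Matches a JSONP wrapper such as  someCallback({...})  or  someCallback({...});
_JSONP_RE = re.compile(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def unwrap_jsonp(text):
    """Return the JSON text inside a JSONP wrapper, or the text unchanged if there is none."""
    match = _JSONP_RE.match(text)
    return match.group(1) if match else text


def parse_payload(text):
    """Decode the outer envelope, then the JSON string stored under 'data'."""
    envelope = json.loads(unwrap_jsonp(text))
    if envelope.get('ret') != 0:
        raise ValueError(f"request failed, ret={envelope.get('ret')}")
    # 'data' is itself a JSON-encoded string, so it needs a second decode
    return json.loads(envelope['data'])


# Synthetic stand-in for res.text, shaped like the envelope described above
inner = json.dumps({'lastUpdateTime': '2020-03-23 08:00:00', 'areaTree': []})
sample = 'jQuery123_456(' + json.dumps({'ret': 0, 'data': inner}) + ')'
payload = parse_payload(sample)
print(payload['lastUpdateTime'])
```

Inside `get_details()`, `parse_payload(res.text)` would take the place of the two `json.loads` calls, and checking `ret` up front turns a failed request into an explicit error rather than a `KeyError` further down.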

Source: https://www.cnblogs.com/rainbow-1/p/14550221.html