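I am trying to get a response from the URL below. I tried to fetch the data with the following code, but unfortunately it returns an empty string.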
import sys
import time

import requests
from requests.adapters import HTTPAdapter

url = 'https://covid19index.in/wp-admin/admin-ajax.php?action=get_wdtable&table_id=21'
params = {
'action': 'get_wdtable',
'table_id': 21
}
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.122 Safari/537.36',
'Accept': 'application/json, text/javascript, */*; q=0.01',
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
'X-Requested-With': 'XMLHttpRequest',
'Connection': 'keep-alive',
'Host': 'covid19index.in',
'Cookie': '_ga=GA1.2.312051040.1587970650; _gid=GA1.2.1069938183.1587970650; _gat=1',
'Sec-Fetch-Dest': 'empty',
'Sec-Fetch-Mode': 'cors',
'Referer': 'https://covid19index.in/district-wise-cases/',
'Origin': 'https://covid19index.in'
}
s = requests.Session()
# Note: this retry adapter is mounted for http:// URLs only.
s.mount('http://', HTTPAdapter(max_retries=3))
time.sleep(2)
try:
    content = s.post(url, data=params, headers=headers)
except requests.exceptions.TooManyRedirects:
    try:
        # Repeat the same request up to 10 times after a redirect loop.
        for _ in range(10):
            content = s.post(url, data=params, headers=headers)
    except:
        print('Failed: ', 'Too many requests and redirects')
        sys.exit()
When I print content.text, it returns '' (an empty string).
I have tried everything I can think of, but I cannot get any output. Any help would be greatly appreciated.
Answer 0 (score: 1)
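I was able to receive data from this server. I copied the request as cURL from the browser inspector and pasted it into the cURL-to-Python converter at https://curl.trillworks.com/, which produced: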
import requests
cookies = {
'wordpress_test_cookie': 'WP+Cookie+check',
'_ga': 'GA1.2.1193786784.1588060314',
'_gid': 'GA1.2.1185668591.1588060314',
'_gat': '1',
}
headers = {
'Connection': 'keep-alive',
'Accept': 'application/json, text/javascript, */*; q=0.01',
'X-Requested-With': 'XMLHttpRequest',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.92 Safari/537.36',
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
'Origin': 'https://covid19index.in',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-Mode': 'cors',
'Sec-Fetch-Dest': 'empty',
'Referer': 'https://covid19index.in/district-wise-cases/',
'Accept-Language': 'fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7',
}
params = (
('action', 'get_wdtable'),
('table_id', '21'),
)
data = {
'draw': '1',
'columns[0][data]': '0',
'columns[0][name]': 'state',
'columns[0][searchable]': 'true',
'columns[0][orderable]': 'false',
'columns[0][search][value]': '',
'columns[0][search][regex]': 'false',
'columns[1][data]': '1',
'columns[1][name]': 'district',
'columns[1][searchable]': 'true',
'columns[1][orderable]': 'false',
'columns[1][search][value]': '',
'columns[1][search][regex]': 'false',
'columns[2][data]': '2',
'columns[2][name]': 'date',
'columns[2][searchable]': 'true',
'columns[2][orderable]': 'false',
'columns[2][search][value]': '',
'columns[2][search][regex]': 'false',
'columns[3][data]': '3',
'columns[3][name]': 'date_total',
'columns[3][searchable]': 'true',
'columns[3][orderable]': 'false',
'columns[3][search][value]': '',
'columns[3][search][regex]': 'false',
'start': '0',
'length': '-1',
'search[value]': '',
'search[regex]': 'false',
'wdtNonce': 'cee9844d13'
}
response = requests.post(
    'https://covid19index.in/wp-admin/admin-ajax.php',
    headers=headers,
    params=params,
    cookies=cookies,
    data=data,
)
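The form fields in `data` above follow the pattern of a DataTables server-side request: one group of `columns[i][...]` keys per column, plus paging and search fields. As a minimal sketch, the same dictionary could be generated instead of written out by hand. The helper name `build_datatables_payload` and its parameters are hypothetical, not part of the original answer. The `wdtNonce` value is copied from the request above; it looks like a WordPress nonce, which is typically time-limited, so it probably has to be refreshed from the page.

```python
def build_datatables_payload(column_names, nonce, draw=1, start=0, length=-1):
    """Build the form fields for one table request (hypothetical helper)."""
    payload = {"draw": str(draw)}
    for i, name in enumerate(column_names):
        prefix = f"columns[{i}]"
        payload[f"{prefix}[data]"] = str(i)
        payload[f"{prefix}[name]"] = name
        payload[f"{prefix}[searchable]"] = "true"
        payload[f"{prefix}[orderable]"] = "false"
        payload[f"{prefix}[search][value]"] = ""
        payload[f"{prefix}[search][regex]"] = "false"
    # length=-1 mirrors the captured request, which asks for all rows at once.
    payload.update({
        "start": str(start),
        "length": str(length),
        "search[value]": "",
        "search[regex]": "false",
        "wdtNonce": nonce,
    })
    return payload


# Column names and nonce are the ones seen in the captured browser request.
data = build_datatables_payload(
    ["state", "district", "date", "date_total"],
    nonce="cee9844d13",
)
```

The resulting `data` has the same 30 keys and values as the hand-written dictionary, so it can be passed to `requests.post(..., data=data)` in the same way.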
The data in `response` can then be accessed:
>>> import ast
>>> ast.literal_eval(response.content.decode("utf-8"))["data"]
[['Andaman and Nicobar Islands',
'North and Middle Andaman',
'27\\/03\\/2020',
'1'],
['Andaman and Nicobar Islands', 'South Andaman', '26\\/03\\/2020', '1'],
['Andaman and Nicobar Islands', 'South Andaman', '27\\/03\\/2020', '4'],
['Andaman and Nicobar Islands', 'South Andaman', '28\\/03\\/2020', '3'],
['Andaman and Nicobar Islands', 'South Andaman', '30\\/03\\/2020', '1'],
['Andaman and Nicobar Islands', 'South Andaman', '08\\/04\\/2020', '1'],
['Andaman and Nicobar Islands', 'South Andaman', '17\\/04\\/2020', '1'],
['Andaman and Nicobar Islands', 'South Andaman', '18\\/04\\/2020', '2'],
['Andaman and Nicobar Islands', 'South Andaman', '19\\/04\\/2020', '1'],
['Andaman and Nicobar Islands', 'South Andaman', '20\\/04\\/2020', '1'],
['Andaman and Nicobar Islands', 'South Andaman', '21\\/04\\/2020', '1'],
['Andaman and Nicobar Islands', 'South Andaman', '22\\/04\\/2020', '1'],
['Andaman and Nicobar Islands', 'South Andaman', '23\\/04\\/2020', '4'],
...
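The `\\/` sequences in the output above are JSON escapes for `/` that `ast.literal_eval` leaves undecoded, and `literal_eval` would also raise an error on JSON values such as `true`, `false` or `null`. As a hedged alternative, the body can be decoded with a JSON parser instead. The sketch below uses a small hand-made sample body shaped like the output above (an assumption, not the real server response) so that it runs without network access; the helper name `parse_rows` is hypothetical.

```python
import json


def parse_rows(body):
    """Decode a JSON response body and return the rows under "data"."""
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    return json.loads(body)["data"]


# Hand-made sample mimicking the shape of the response shown above.
sample = b'{"data": [["Andaman and Nicobar Islands", "South Andaman", "26\\/03\\/2020", "1"]]}'
rows = parse_rows(sample)
print(rows[0][2])  # 26/03/2020
```

With a live `requests` response, `response.json()["data"]` should give the same result in one call.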