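I am trying to get a JSON response from a specific link (see the Python code below) using Python's requests module. When I test the link in RESTer for Firefox (or just paste it into the browser's address bar), it returns the expected data: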
fetchJSON_comment98({"productAttr":null,"productCommentSummary":{"skuId":100020974898,"averageScore":5,"defaultGoodCount":0,"defaultGoodCountStr":"10��+","commentCount":0,"commentCountStr":"10��+","goodCount":0,"goodCountStr":"2.1��+","goodRate":0.97,"goodRateShow":97,"generalCount":0,"generalCountStr":"200+","generalRate":0.02,"generalRateShow":2,"poorCoun... (truncated)
The same body and headers also show up in Firefox's Network Inspector.
But when I try the following code from Python 3.7:
from requests import Session
url = "https://club.jd.com/comment/productPageComments.action?callback=fetchJSON_comment98&productId=100020974898&score=0&sortType=6&page=0&pageSize=10&isShadowSku=0&fold=1"
headers = {"Host": "club.jd.com",
"Pragma": "no-cache",
"Cache-Control": "no-cache",
"User-Agent": "Mozilla/5.0"}
s = Session()
resp = s.get(url=url, headers=headers)
print(resp.text)
I get an HTTP 200 response with an empty body and the following response headers:
'date' (1890118560096) = {tuple} <class 'tuple'>: ('Date', 'Wed, 09 Jun 2021 09:33:02 GMT')
'content-type' (1890118524080) = {tuple} <class 'tuple'>: ('Content-Type', 'text/html;charset=GBK')
'transfer-encoding' (1890118269376) = {tuple} <class 'tuple'>: ('Transfer-Encoding', 'chunked')
'connection' (1890118524464) = {tuple} <class 'tuple'>: ('Connection', 'close')
'vary' (1890118560376) = {tuple} <class 'tuple'>: ('Vary', 'Accept-Encoding')
'content-encoding' (1890118568528) = {tuple} <class 'tuple'>: ('Content-Encoding', 'gzip')
'server' (1890118560880) = {tuple} <class 'tuple'>: ('Server', 'jfe')
'strict-transport-security' (1890118632112) = {tuple} <class 'tuple'>: ('Strict-Transport-Security', 'max-age=7776000')
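I have tried adding cookies with a CookieJar, copying them from the browser's response, and crafting my own, but none of that worked. I also tried many of the solutions listed on Stack Overflow, without success...
Please help: what am I doing wrong?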
Answer 0 (score: 0)
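The problem is the User-Agent header. Change it to the full value a real browser sends and the code works. You can read more about the User-Agent header format here: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent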
from requests import Session
url = "https://club.jd.com/comment/productPageComments.action?callback=fetchJSON_comment98&productId=100020974898&score=0&sortType=6&page=0&pageSize=10&isShadowSku=0&fold=1"
headers = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36"}
s = Session()
resp = s.get(url=url, headers=headers)
print(resp.text)
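Once the body is no longer empty, note that it is JSONP rather than plain JSON: the payload is wrapped in the fetchJSON_comment98(...) callback from the URL, so resp.json() would likely fail on it. Below is a minimal sketch of one way to unwrap it. The helper name parse_jsonp and the shortened sample string are my own illustrations, not part of the original code, and the regular expression assumes a single callback wrapping the whole body.

```python
import json
import re

# Matches `callbackName( ... )` with an optional trailing semicolon.
# Assumption: the whole body is exactly one callback invocation.
_JSONP_RE = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def parse_jsonp(text):
    """Strip a JSONP callback wrapper and return the decoded JSON payload."""
    match = _JSONP_RE.match(text)
    if match is None:
        # An empty body (as in the question) ends up here as well.
        raise ValueError("response is not JSONP (empty body or unexpected format)")
    return json.loads(match.group(1))


# Shortened, hand-written sample in the same shape as the real response.
sample = 'fetchJSON_comment98({"productCommentSummary": {"skuId": 100020974898, "goodRate": 0.97}});'
data = parse_jsonp(sample)
print(data["productCommentSummary"]["goodRate"])
```

With the code above, this would be called as parse_jsonp(resp.text). Raising ValueError on an empty body makes the "HTTP 200 but nothing returned" case fail loudly instead of silently printing an empty string.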