Question

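When I try to read the response to a request made to a particular website, the call hangs forever, which is most likely some form of blocking. What I don't understand is how the curl request, which successfully receives a response, differs from the Python GET request, which never receives one.

Note: the curl command is expected to return an error, because I am not sending the required information (such as cookies).

curl: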

curl 'https://www.yellowpages.com.au/search/listings?clue=Programmer&locationClue=All+States&pageNumber=3&referredBy=UNKNOWN&&eventType=pagination' -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0'

Successfully gets a response.

Python:

import requests
r = requests.get('https://www.yellowpages.com.au/search/listings?clue=Programmer&locationClue=All+States&pageNumber=3&referredBy=UNKNOWN&&eventType=pagination', headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0'})

Hangs forever while reading.

Answer 1

The two requests may differ in subtle ways. For example, Python requests adds some headers automatically:

'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'

(You can see them by inspecting r.request.headers.)

curl adds Accept: */* but does not ask for gzip unless you tell it to. The site in question appears to support gzip, however, so the problem must lie elsewhere.
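To compare the headers without sending anything over the network, the request can be prepared through a Session and inspected. This is only a sketch: the helper name prepared_headers is made up for illustration, and the exact Accept-Encoding value can vary with the installed version of requests and its optional dependencies.

```python
import requests

URL = "https://www.yellowpages.com.au/search/listings"
CUSTOM_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0"
}


def prepared_headers(url, headers=None):
    """Return the headers requests would send for a GET, without sending it.

    prepared_headers is a hypothetical helper, not part of requests.
    """
    session = requests.Session()
    # prepare_request merges the session's default headers with ours,
    # which is the same merge that requests.get performs internally.
    prepared = session.prepare_request(requests.Request("GET", url, headers=headers))
    return dict(prepared.headers)


if __name__ == "__main__":
    for name, value in prepared_headers(URL, CUSTOM_HEADERS).items():
        print(f"{name}: {value}")
```

Running curl with -v prints the headers it sends, which gives the other side of the comparison.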

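Suggestion: add a timeout to your request and catch any exception it raises, so that the call fails after a bounded wait instead of hanging.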
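A minimal sketch of that suggestion follows. The helper name fetch and the timeout values are assumptions chosen for illustration; the original code from this answer is not reproduced here. By default requests waits indefinitely, so the timeout has to be passed explicitly.

```python
import requests


def fetch(url, headers=None, connect_timeout=5.0, read_timeout=15.0):
    """Return (response, error); exactly one of the two is None.

    fetch is a hypothetical helper; the timeout defaults are arbitrary.
    """
    try:
        # timeout=(connect, read), both in seconds. Without it, requests
        # will wait forever for a server that never answers.
        response = requests.get(
            url, headers=headers, timeout=(connect_timeout, read_timeout)
        )
        return response, None
    except requests.exceptions.Timeout as exc:
        # Covers both ConnectTimeout and ReadTimeout.
        return None, f"timed out: {exc}"
    except requests.exceptions.RequestException as exc:
        # Base class for every other error raised by requests.
        return None, f"request failed: {exc}"


if __name__ == "__main__":
    url = (
        "https://www.yellowpages.com.au/search/listings"
        "?clue=Programmer&locationClue=All+States&pageNumber=3"
        "&referredBy=UNKNOWN&&eventType=pagination"
    )
    ua = "Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0"
    response, error = fetch(url, headers={"User-Agent": ua})
    print(error if error else response.status_code)
```

This does not explain why the server stops answering, but it turns a silent hang into an error that can be logged or retried.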

Answer 2

It works with Python 3. The following request returns a response and prints its headers:

import requests
r = requests.get('https://www.yellowpages.com.au/search/listings?clue=Programmer&locationClue=All+States&pageNumber=3&referredBy=UNKNOWN&&eventType=pagination', headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0'})
print(r.headers)

Python requests hangs, whereas curl does not (same request)
