Python program hangs when trying to send an HTTP GET request

Time: 2018-08-04 20:36:25

Tags: web-scraping python-requests python-3.6 bad-request

I want to send an HTTP GET request to sahibinden using the requests library.

Any browser can do this, and I want to simulate it from Python. Here is my code:

from requests import get

headers = {
    'Host': 'www.sahibinden.com',
    'User-Agent':  'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1',
    'Cookie': 'MS1=https://www.sahibinden.com/category/en/real-estate; vid=668; cdid=FareXv19cbCTdery5b49a52a; MDR=20180606; __gfp_64b=bDEm7fs0Wb7A.7Rrxx3Vc8KWiV2tqUPA6HKxPqxMzzD.Q7; __gads=ID=c46feb38656fe808:T=1531553074:S=ALNI_MZsbdGGUmPpzuJMK0RROk--kk0Y9w; _ga=GA1.2.2138815818.1531553081; nwsh=std; showPremiumBanner=false; showCookiePolicy=true; userLastSearchSplashClosed=true; MS1=https://www.sahibinden.com/category/en/real-estate; st=a6abb06a0b0f9430ea7fdebd78bf1a15232062dddb59afb52b771d194a3529a1e30b6ca15b691061108084738973f686da6e51c3e00daf378',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
}
res = get(
    "https://www.sahibinden.com",
    headers=headers,
    timeout=10,  # matches the "read timeout=10" reported in the error below
)
print(res.status_code)

I am trying to do exactly what the browser does, so I copied every header the browser sends (they are all in the code above).
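To check what would actually go out on the wire, the request can be prepared without being sent. The following is only a minimal sketch: the header dictionary is a shortened, hypothetical stand-in for the full one above, and it relies on `Session.prepare_request`, which merges the session's default headers with the ones passed explicitly.

```python
from requests import Request, Session

# Shortened, hypothetical header set; the full one is in the code above.
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1",
    "Accept-Language": "en-US,en;q=0.9",
}

session = Session()

# prepare_request builds the outgoing request without opening a connection.
# Explicit headers override the session defaults with the same name.
prepared = session.prepare_request(
    Request("GET", "https://www.sahibinden.com", headers=headers)
)

# Print every header that would be sent, including the defaults added by requests.
for name, value in prepared.headers.items():
    print(f"{name}: {value}")
```

Comparing this output with the browser's request shows which headers differ. It does not show differences below the HTTP layer, such as the TLS handshake.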

The browser gets a response in under 3 seconds and displays the site normally, but my program gets:

requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='www.sahibinden.com', port=443): Read timed out. (read timeout=10)
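As far as I understand it, a `ReadTimeout` (as opposed to a `ConnectTimeout`) means the connection itself was established and the server then sent no data within the read window. The sketch below reproduces that situation locally under stated assumptions: the silent server, its port, and the timeout values are hypothetical, and it uses plain HTTP on localhost rather than HTTPS against the real site.

```python
import socket
import threading

import requests

# Hypothetical local server: accepts one connection and never answers,
# imitating a host that accepts the connection and then stays silent.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

release = threading.Event()


def accept_and_stall():
    conn, _ = server.accept()
    release.wait(timeout=5)  # hold the connection open without replying
    conn.close()


threading.Thread(target=accept_and_stall, daemon=True).start()

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for localhost

try:
    # timeout=(connect, read): 3 s to connect, 0.5 s to wait for data.
    session.get(f"http://127.0.0.1:{port}/", timeout=(3, 0.5))
    outcome = "response received"
except requests.exceptions.ConnectTimeout:
    outcome = "connect timeout"
except requests.exceptions.ReadTimeout:
    outcome = "read timeout"
finally:
    release.set()
    server.close()

print(outcome)
```

Passing the timeout as a `(connect, read)` tuple makes it possible to tell the two failure modes apart. In my case the error is on the read side.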

How can I solve this problem?

0 answers:

No answers.