Question

I am trying to fetch the HTML content of a URL with requests.get in Python, but the response I get back is incomplete.

import requests
from lxml import html


url = "https://www.expedia.com/Hotel-Search?destination=Maldives&latLong=3.480528%2C73.192127&regionId=109&startDate=04%2F20%2F2018&endDate=04%2F21%2F2018&rooms=1&_xpid=11905%7C1&adults=2"
headers = {
    'User-Agent': ('Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 '
                   '(KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36'),
    'Content-Type': 'text/html',
    }

response = requests.get(url, headers=headers)
print(response.content)

Can anyone suggest a change that would return the complete response?

Note: I can get the complete response with Selenium, but that is not the recommended approach.

Answer 1

If you need content that is generated dynamically by JavaScript and you don't want to use Selenium, you can try requests-html, which supports JavaScript:

from requests_html import HTMLSession

session = HTMLSession()
url = "https://www.expedia.com/Hotel-Search?destination=Maldives&latLong=3.480528%2C73.192127&regionId=109&startDate=04%2F20%2F2018&endDate=04%2F21%2F2018&rooms=1&_xpid=11905%7C1&adults=2"
r = session.get(url)
r.html.render()

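# After render(), r.html.html holds the JavaScript-rendered markup;
# r.content is still the original, unrendered response body.
print(r.html.html)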
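
Before reaching for a JavaScript-capable tool, it can help to confirm that the missing content really is produced by JavaScript. One rough heuristic is to look at the static HTML that requests.get returns: several script tags and very little visible text suggest the page is filled in on the client side. The sketch below uses only the standard library. The names ScriptTextCounter and summarize_static_html and the sample markup are made up for illustration and are not taken from the actual Expedia page.

```python
from html.parser import HTMLParser


class ScriptTextCounter(HTMLParser):
    """Counts <script> tags and the visible text outside <script>/<style>."""

    def __init__(self):
        super().__init__()
        self.script_count = 0
        self.visible_chars = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_count += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count text a reader would see, ignoring pure whitespace.
        if self._skip_depth == 0:
            self.visible_chars += len(data.strip())


def summarize_static_html(markup):
    """Return the script-tag count and visible character count of markup."""
    parser = ScriptTextCounter()
    parser.feed(markup)
    parser.close()
    return {"scripts": parser.script_count, "visible_chars": parser.visible_chars}


# Hypothetical stand-in for a page whose results are loaded by JavaScript.
sample = """<html><head><script src="app.js"></script></head>
<body><div id="results"></div><script>loadHotels();</script></body></html>"""

summary = summarize_static_html(sample)
print(summary)
```

In practice you would pass response.text from requests.get instead of the sample string. This is only a heuristic: a page can contain plenty of static text and still load the part you care about through JavaScript.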
