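Posting form data isn't working, and since my other posts about this problem didn't get anywhere, I thought I'd ask again in the hope of getting a different perspective. I'm currently trying to get requests.get(url, data=q) to work. When I print the result, I get a "page not found" response. The only way I have made it work is by setting variables and concatenating them into the full URL, but I really want to learn this part of requests. Where am I going wrong? For the form, I'm using the HTML tag attributes name=search_terms and name=geo_location_terms.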
search_terms = "Bars"
location = "New Orleans, LA"
url = "https://www.yellowpages.com"
q = {'search_terms': search_terms, 'geo_locations_terms': location}
page = requests.get(url, data=q)
print(page.url)
Answer 0 (score: 2)
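Your code has a few small mistakes:
1. The search form submits to a different path, so the URL should be url = "https://www.yellowpages.com/search".
2. The field name is geo_location_terms, not geo_locations_terms.
3. Query parameters for requests.get are passed as params, not as request data (data).
So the final version of the code is: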
import requests
search_terms = "Bars"
location = "New Orleans, LA"
url = "https://www.yellowpages.com/search"
q = {'search_terms': search_terms, 'geo_location_terms': location}
page = requests.get(url, params=q)
print(page.url)
Result:
https://www.yellowpages.com/search?search_terms=Bars&geo_location_terms=New+Orleans%2C+LA
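To see the difference between params and data without contacting the site, the request can be built and inspected before it is sent. This is a minimal sketch, assuming the requests library is installed; it only prepares the requests and never touches the network:

```python
import requests

url = "https://www.yellowpages.com/search"
q = {"search_terms": "Bars", "geo_location_terms": "New Orleans, LA"}

# params= is URL-encoded and appended to the URL as a query string.
with_params = requests.Request("GET", url, params=q).prepare()

# data= is form-encoded into the request body; the URL is left unchanged.
with_data = requests.Request("GET", url, data=q).prepare()

print(with_params.url)   # URL including the query string
print(with_data.url)     # bare URL, no query string
print(with_data.body)    # the form-encoded payload that ended up in the body
```

With data=, the search terms never reach the URL, which is consistent with the site not recognizing the search in the original code.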
Answer 1 (score: 1)
In addition to the issues pointed out by @Lev Zakharov, you also need to set cookies in the request, like this:
import requests
search_terms = "Bars"
location = "New Orleans, LA"
url = "https://www.yellowpages.com/search"
with requests.Session() as session:
    session.headers.update({
        'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36',
        # Placeholder: replace 'cookies' with the Cookie header value copied from your browser
        'Cookie': 'cookies'
    })
    # The field name is geo_location_terms (singular "location")
    q = {'search_terms': search_terms, 'geo_location_terms': location}
    response = session.get(url, params=q)
    print(response.url)
    print(response.status_code)
Output
https://www.yellowpages.com/search?search_terms=Bars&geo_location_terms=New+Orleans%2C+LA
200
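To get the cookies, inspect the request with a network listener (for example, the Network tab in Chrome Developer Tools), then replace the value 'cookies' with what you find there.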
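As an alternative to pasting a raw Cookie header string, the copied cookies can be placed in the session's cookie jar. This is only a sketch under assumptions: the cookie name session_id and the value abc123 are hypothetical placeholders, not real yellowpages.com cookies, and the request is prepared but not sent so the outgoing headers can be inspected:

```python
import requests

url = "https://www.yellowpages.com/search"
q = {"search_terms": "Bars", "geo_location_terms": "New Orleans, LA"}

with requests.Session() as session:
    session.headers.update({"User-Agent": "Mozilla/5.0"})
    # Hypothetical cookie; substitute the name and value seen in the Network tab.
    session.cookies.set("session_id", "abc123", domain="www.yellowpages.com")

    # Build the request through the session without sending it.
    prepared = session.prepare_request(requests.Request("GET", url, params=q))
    print(prepared.url)
    print(prepared.headers.get("Cookie"))
```

Calling session.send(prepared) would then issue the request with those cookies attached.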