I want to use urllib to search from the search box at https://bigfuture.collegeboard.org
This is what I have, but it only gives me the HTML of the home page:
import requests
from urllib import urlopen
from urllib import urlencode
from bs4 import BeautifulSoup
url = "https://bigfuture.collegeboard.org"
data = urlencode({'q': 'financial analyst'})
results = requests.post(url, data)
soup = BeautifulSoup(results.content, 'html.parser').encode("ascii", "ignore")
output = open('text.txt','w')
output.write(soup)
How do I fill in and submit the search box?
Answer 0: (score: 0)
You need to include the /sitesearch endpoint in your URL. If I search for "uconn", the URL the site requests is:
https://bigfuture.collegeboard.org/sitesearch?q=uconn&searchType=bf_site&tp=bf_site
So all you need to do is change your URL to:
url = "https://bigfuture.collegeboard.org/sitesearch"
Also make sure you close the file object, or use a with context manager so it is closed for you.
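Putting both points together, here is a minimal sketch, written for Python 3 and using only the standard library. The URL observed above carries the search term in the query string, so the sketch issues a GET request instead of a POST. The helper names build_search_url and save_search_results are hypothetical, and the searchType and tp values are simply copied from the observed URL; whether the site still answers at this endpoint is an assumption.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the URL the site requested for the "uconn" search
SEARCH_URL = "https://bigfuture.collegeboard.org/sitesearch"


def build_search_url(query):
    """Return the search URL for a query (hypothetical helper)."""
    # searchType and tp are copied from the observed URL; their meaning is assumed
    params = {"q": query, "searchType": "bf_site", "tp": "bf_site"}
    # urlencode escapes the values, e.g. a space becomes '+'
    return SEARCH_URL + "?" + urlencode(params)


def save_search_results(query, path):
    """Fetch the results page and write the raw bytes to path (hypothetical helper)."""
    # Both context managers close their resource even if an error is raised
    with urlopen(build_search_url(query)) as response:
        html = response.read()
    with open(path, "wb") as output:
        output.write(html)
```

Calling save_search_results("financial analyst", "text.txt") would then write the results page to text.txt, with no explicit close() call needed.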
Answer 1: (score: 0)
Just put the query parameters directly in the URL, e.g.
import requests

searches = ['test', 'new search']
for search in searches:
    # Encode spaces as '+' so the term is valid inside a query string
    search = search.replace(' ', '+')
    url = 'https://bigfuture.collegeboard.org/sitesearch?q=%s&searchType=bf_site&tp=bf_site' % search
    print(url)
    response = requests.get(url)
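Replacing spaces by hand only handles spaces; a term containing characters such as & or = would still break the query string. As a hedged alternative for Python 3, the standard library's urllib.parse.quote_plus escapes all of them. The name search_urls and the extra "R&D jobs" term are hypothetical and are there only to show the escaping.

```python
from urllib.parse import quote_plus

# Same URL pattern as above, with %s marking where the escaped term goes
URL_TEMPLATE = (
    "https://bigfuture.collegeboard.org/sitesearch"
    "?q=%s&searchType=bf_site&tp=bf_site"
)


def search_urls(searches):
    """Return one fully escaped search URL per term (hypothetical helper)."""
    # quote_plus turns spaces into '+' and percent-encodes reserved characters
    return [URL_TEMPLATE % quote_plus(term) for term in searches]


for url in search_urls(["test", "new search", "R&D jobs"]):
    print(url)
```

Each resulting URL can be passed to requests.get(url) exactly as in the loop above.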