I am trying to run a search query on Yahoo with the following code:
But this does not work: the result is empty. How can I get the first 10 search results?
Answer 0 (score: 2)
You can use a CSS selector to find all the links, which should also be faster.
import requests
from bs4 import BeautifulSoup

query = "deep"
# The n parameter is intended to request 10 results
yahoo = "https://search.yahoo.com/search?q=" + query + "&n=" + str(10)
raw_page = requests.get(yahoo)
soup = BeautifulSoup(raw_page.text, 'lxml')
# Select the elements that carry all four result-link classes
for link in soup.select(".ac-algo.fz-l.ac-21th.lh-24"):
    print(link.text, link['href'])
Output:
('Deep | Definition of Deep by Merriam-Webster', 'https://www.merriam-webster.com/dictionary/deep')
('Connecticut Department of Energy & Environmental Protection', 'https://www.ct.gov/deep/site/default.asp')
('Deep | Define Deep at Dictionary.com', 'https://www.dictionary.com/browse/deep')
('Deep - definition of deep by The Free Dictionary', 'https://www.thefreedictionary.com/deep')
('Deep (2017) - IMDb', 'https://www.imdb.com/title/tt4105584/')
('Deep Synonyms, Deep Antonyms | Merriam-Webster Thesaurus', 'https://www.merriam-webster.com/thesaurus/deep')
('Deep Synonyms, Deep Antonyms | Thesaurus.com', 'https://www.thesaurus.com/browse/deep')
('DEEP: Fishing - Connecticut', 'https://www.ct.gov/deep/cwp/view.asp?q=322708')
('Deep Deep Deep - YouTube', 'https://www.youtube.com/watch?v=oZhwagxWzOc')
('deep - English-Spanish Dictionary - WordReference.com', 'https://www.wordreference.com/es/translation.asp?tranword=deep')
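One side note on the URL: the code above concatenates the query directly into the string, which works for a single word like "deep" but may break for queries with spaces or characters such as "&". A minimal sketch of building the same URL with the standard library's urlencode follows. The helper name build_search_url and the constant BASE_URL are made up for illustration; the parameter names q and n are taken from the URL in the answer.

```python
from urllib.parse import urlencode

# Base address taken from the answer's URL
BASE_URL = "https://search.yahoo.com/search"


def build_search_url(query, count=10):
    """Return the search URL with the query string percent-encoded.

    build_search_url is a hypothetical helper, not part of any library.
    """
    # "q" and "n" are the parameter names used in the answer's URL
    params = {"q": query, "n": count}
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Spaces become "+" and "&" becomes "%26"
    print(build_search_url("deep learning & AI"))
```

The resulting string can be passed to requests.get in place of the hand-built one. Alternatively, requests.get accepts a params dictionary and does the encoding itself.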
Answer 1 (score: 1)
Here are the main problems with your code:
When using Beautiful Soup, you should always specify a parser, e.g. BeautifulSoup(raw_page.text, "lxml").
You are searching for the wrong class: it is " ac-algo fz-l ac-21th lh-24" rather than "ac-algo fz-l ac-21th lh-24" (note the leading space).
The complete code should look like this:
import requests
from bs4 import BeautifulSoup

query = "deep"
yahoo = "https://search.yahoo.com/search?q=" + query + "&n=" + str(10)
raw_page = requests.get(yahoo)
# Name the parser explicitly
soup = BeautifulSoup(raw_page.text, "lxml")
# Match the class attribute as one string, including the leading space
for link in soup.find_all(attrs={"class": " ac-algo fz-l ac-21th lh-24"}):
    print(link.text, link.get('href'))
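Matching the class attribute as one exact string is sensitive to whitespace, as the leading space shows. A possible alternative is to compare individual class tokens, so that leading spaces and token order do not matter. The sketch below illustrates that idea using only the standard library's html.parser rather than Beautiful Soup. The names ClassLinkCollector and extract_links, the SAMPLE markup, and the example.com URLs are all invented for illustration; this is not real Yahoo output.

```python
from html.parser import HTMLParser


class ClassLinkCollector(HTMLParser):
    """Collect (text, href) for <a> tags that carry every required class.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, required_classes):
        super().__init__()
        self.required = set(required_classes)
        self.links = []       # finished (text, href) pairs
        self._inside = False  # True while inside a matching <a>
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # split() ignores leading, trailing, and repeated whitespace
        tokens = set((attr.get("class") or "").split())
        if self.required <= tokens:
            self._inside = True
            self._href = attr.get("href")
            self._text = []

    def handle_data(self, data):
        if self._inside:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._inside:
            self.links.append(("".join(self._text).strip(), self._href))
            self._inside = False


# Made-up markup: note the leading space and the different token order
SAMPLE = '''
<a class=" ac-algo fz-l ac-21th lh-24" href="https://example.com/a">First</a>
<a class="lh-24 ac-algo  fz-l ac-21th" href="https://example.com/b">Second</a>
<a class="other" href="https://example.com/c">Ignored</a>
'''


def extract_links(html, classes=("ac-algo", "fz-l", "ac-21th", "lh-24")):
    collector = ClassLinkCollector(classes)
    collector.feed(html)
    collector.close()
    return collector.links


if __name__ == "__main__":
    for text, href in extract_links(SAMPLE):
        print(text, href)
```

Both of the first two links are collected even though their class attributes differ as raw strings. In Beautiful Soup, a CSS selector such as the one in the other answer gives the same token-based matching.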
Hope this helps.