I'm writing a program that searches Google for "jopa olega" and prints the URL of the first result.
Here is the code I'm running:
import requests, webbrowser, bs4
res = requests.get("https://www.google.com/search?q=" + "jopa olega")
res.raise_for_status()
soup = bs4.BeautifulSoup(res.text, features="html.parser")
links = soup.select('div#main > div > div > div > a')
href = links[0].get('href') # <---- problem may be here
print(href)
What I expect to see:
https://pirozhki-ru.livejournal.com/990964.html
Actual output:
/url?q=https://pirozhki-ru.livejournal.com/990964.html&sa=U&ved=2ahUKEwjppYzLgKTlAhUMxosKHS5rDmkQFjAAegQIBBAB&usg=AOvVaw0UtLIaLS93pUQMWBngtgz7
This is the HTML of the link as it appears in the browser:
<a href="https://pirozhki-ru.livejournal.com/990964.html"
ping="/url?sa=t&source=web&rct=j&url=https://pirozhki-ru.livejournal.com/990964.html&ved=2ahUKEwiHn7P9h6TlAhURpIsKHRX5CRwQFjAAegQIAhAB">...
</a>
By the way, the output is different every time I run it. Does anyone know why this happens? Any help is appreciated. Thanks.
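For reference, the "/url?q=..." wrapper in the actual output can be unwrapped with the standard library. This is only a minimal sketch, assuming the href always carries the target address in its "q" query parameter; the helper name unwrap_google_href is made up for illustration, and Google's markup may change at any time. The "ved" and "usg" values look like per-request tokens, which would also explain why the raw output differs between runs while the "q" value stays the same.

```python
from urllib.parse import urlparse, parse_qs


def unwrap_google_href(href):
    """Return the target URL from a Google '/url?q=...' redirect href.

    Hypothetical helper: hrefs that do not look like such a redirect
    are returned unchanged.
    """
    parsed = urlparse(href)
    if parsed.path == "/url":
        # parse_qs maps each query parameter to a list of decoded values
        targets = parse_qs(parsed.query).get("q")
        if targets:
            return targets[0]
    return href


# The href from the "actual output" above
raw = (
    "/url?q=https://pirozhki-ru.livejournal.com/990964.html"
    "&sa=U&ved=2ahUKEwjppYzLgKTlAhUMxosKHS5rDmkQFjAAegQIBBAB"
    "&usg=AOvVaw0UtLIaLS93pUQMWBngtgz7"
)
print(unwrap_google_href(raw))
```

In the original script this would replace the last line with print(unwrap_google_href(href)).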