Question

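OK, I want to search a web page for the first link inside an h5 heading, for example
<h5><a href="http://example.org/anything/">anything</a></h5>
1. How do I tell Python that "anything" can be anything?
2. How do I then print the hyperlink (or its title) to Discord?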

So far I have managed to fetch the site's source with the following:

import requests

link = "http://www.example.com"
f = requests.get(link)

print(f.text)

I know I can print text to Discord with the following command:

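# A Python identifier cannot contain "-", so the command name is passed via name=
@bot.command(name="latest-release", pass_context=True)
async def latest_release(ctx):
    await bot.say("This should be the mentioned Link")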

I did follow this tutorial, but I can't seem to import BeutifulSoup or BeautifulSoup ... https://www.pythonforbeginners.com/beautifulsoup/scraping-websites-with-beautifulsoup

Answer 1

If you have installed bs4 via pip, you should be able to import it in Python 3 with:

from bs4 import BeautifulSoup

From there, turn the web page into a soup and navigate to the hyperlink:

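# f is the requests response from the question
soup = BeautifulSoup(f.text, "html.parser")
header = soup.find("h5")  # None if no <h5> can be found
# or:
# header = soup.h5                 # also None if there is no <h5>
# header = soup.find_all("h5")[0]  # raises IndexError if there is no <h5>
link = header.a     # first <a> inside the heading
url = link["href"]  # the hyperlink target
text = link.text    # the visible link title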
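
The snippet above fails with an AttributeError when the page has no h5 or the heading contains no link. A minimal sketch of a more defensive version follows; the function name first_h5_link and the sample string are made up for illustration and are not part of the original answer. Because find_all matches on the tag name only, the content of the heading ("anything") really can be anything.

```python
from bs4 import BeautifulSoup


def first_h5_link(html):
    """Return (url, text) of the first link found inside an <h5>, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # Walk the headings in document order and stop at the first one with a link.
    for header in soup.find_all("h5"):
        link = header.find("a", href=True)  # only <a> tags that carry an href
        if link is not None:
            return link["href"], link.get_text(strip=True)
    return None


# Hypothetical sample input, taken from the example in the question.
sample = '<h5><a href="http://example.org/anything/">anything</a></h5>'
print(first_h5_link(sample))
```

With a real page, pass f.text from the requests call instead of sample, then send the returned URL or title from inside the bot command.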

Here is the documentation for BeautifulSoup4.
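
If bs4 still cannot be imported, the same lookup can be done with the standard library alone. This is a different technique from the one in the answer (html.parser instead of BeautifulSoup), shown only as a rough fallback sketch; the class FirstH5Link and the function first_h5_link_stdlib are hypothetical names.

```python
from html.parser import HTMLParser


class FirstH5Link(HTMLParser):
    """Record the href and text of the first <a> that appears inside an <h5>."""

    def __init__(self):
        super().__init__()
        self.in_h5 = False     # currently inside an <h5> element
        self.in_link = False   # currently inside the link being captured
        self.done = False      # the first link has been fully read
        self.url = None
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "h5":
            self.in_h5 = True
        elif tag == "a" and self.in_h5 and self.url is None:
            href = dict(attrs).get("href")
            if href is not None:
                self.url = href
                self.in_link = True

    def handle_data(self, data):
        if self.in_link:
            self.text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.in_link:
            self.in_link = False
            self.done = True
        elif tag == "h5":
            self.in_h5 = False


def first_h5_link_stdlib(html):
    """Return (url, text) of the first link inside an <h5>, or None."""
    parser = FirstH5Link()
    parser.feed(html)
    parser.close()
    if parser.url is None:
        return None
    return parser.url, "".join(parser.text_parts).strip()
```

html.parser is less forgiving of malformed markup than BeautifulSoup, so installing bs4 remains the more robust option.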
