Question

I'm new to coding in Python, but so far I have written a few simple scrapers with bs4. I'm having trouble with one particular project:

import requests
from bs4 import BeautifulSoup

page = requests.get("http://www.radarindustrial.com.br/empresa/19640/")
soup = BeautifulSoup(page.content, 'html.parser')

web = soup.find_all(href = True, id = "contatos")

It returns []. When I try only

web = soup.find_all(id = "contatos")

it correctly returns the div I need, which contains an href (I inserted the dot only so that the part of the code I need, the URL, is displayed):

<.a href="/Redirect.aspx?cid=19640&url=http://www.ashtarbrindes.com.br" target="

I tried "web.a", find("a", id="contatos") and other approaches, but they return either an empty list or None.

What am I doing wrong?

Answer 1

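You can use

soup.find("div", {"id": "contatos"}).select_one('a[href]')['href']

.find("div", {"id": "contatos"}) extracts the div whose id equals contatos, .select_one('a[href]') then selects the first a with an href attribute inside that div, and ['href'] accesses the value of the href attribute.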
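
It may help to see why the call in the question comes back empty: find_all(href=True, id="contatos") applies both filters to the same tag, and the div carries the id while the a inside it carries the href, so no single tag matches. The following is a minimal sketch under assumptions: the html string is a hypothetical stand-in for the live page, built only from the id and the href quoted in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page; only the id and the href
# value are taken from the question, the rest is assumed.
html = """
<div id="contatos">
  <a href="/Redirect.aspx?cid=19640&amp;url=http://www.ashtarbrindes.com.br">site</a>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Both keyword filters must match the SAME tag. The div has the id but
# no href, and the a has an href but no id, so nothing matches.
both = soup.find_all(href=True, id="contatos")

# Locate the div first, then search for a link inside it.
div = soup.find("div", {"id": "contatos"})
link = div.find("a", href=True)
href = link["href"]

print(both)
print(href)
```

Here div.find("a", href=True) plays the same role as .select_one('a[href]'); either way the search is narrowed to the div before the link is looked up.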

Answer 2

Well, if we're fine with CSS, how about:

soup.select_one('div#contatos a[href]')['href']
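
One caveat with the one-liner: select_one returns None when nothing matches, so subscripting with ['href'] would then raise a TypeError. The sketch below guards against that. The helper name first_href and the inline html string are hypothetical, introduced only for illustration. The last step, pulling the url query parameter out of the redirect link with urllib.parse, assumes that the target site address is what is ultimately wanted.

```python
from urllib.parse import parse_qs, urlparse

from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page, modelled on the tag in the question.
html = (
    '<div id="contatos">'
    '<a href="/Redirect.aspx?cid=19640&amp;url=http://www.ashtarbrindes.com.br">site</a>'
    "</div>"
)


def first_href(markup, selector):
    """Return the href of the first tag matching selector, or None."""
    tag = BeautifulSoup(markup, "html.parser").select_one(selector)
    return tag["href"] if tag is not None else None


href = first_href(html, "div#contatos a[href]")
missing = first_href(html, "div#missing a[href]")  # no such div, so None

# Assumption: the wanted value is the "url" parameter of the redirect link.
target = parse_qs(urlparse(href).query)["url"][0]

print(href)
print(missing)
print(target)
```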
