Question

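I'm teaching myself Python by doing rather than just reading. When I run this code the terminal shows no errors, but nothing happens, even though it looks to me like it should retrieve the address from this listing.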

Is there something wrong with how I'm calling this data?

import requests
from bs4 import BeautifulSoup

url = ("http://www.gym-directory.com/listing/bulldog-gym/")
r = requests.get(url)

soup = BeautifulSoup(r.content, "html.parser")
print(soup.prettify())

listing_data = soup.find_all("div", {"class":"tab-contents"})
for item in listing_data:
    print(item.contents[0].find_all("span",{"class":"wlt_shortcode_map_location"})[0].text)

Answer 1

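The actual problem is simple, but you'll run into other issues as well. The main problem is that the class you search for first is tab-content (no "s"), not tab-contents.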
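
As a rough sketch of why the script printed nothing: find_all returns an empty list when no element carries the requested class, so the for loop body never runs and no error is raised. The HTML string below is a made-up stand-in for the real page, not its actual markup.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page; the real markup is much larger.
html = (
    '<div class="tab-content">'
    '<span class="wlt_shortcode_map_location">1 Example Street</span>'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")

wrong = soup.find_all("div", {"class": "tab-contents"})  # misspelled class
right = soup.find_all("div", {"class": "tab-content"})   # class used in the markup

print(len(wrong))  # an empty result means a loop over it does nothing
print(len(right))
```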

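Once that is fixed, you'll find that two elements on the page have the tab-content class. The second iteration raises an exception because that element is structured differently.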
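
The failure on the second element could look something like the sketch below. The markup is hypothetical (the second div's real content is not shown in the question), and the loop is simplified to call find_all on each div directly; the point is only that indexing an empty result raises IndexError unless you check first.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two divs share the class, only the first holds the address.
html = """
<div class="tab-content"><span class="wlt_shortcode_map_location">1 Example Street</span></div>
<div class="tab-content"><p>Opening hours</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

addresses = []
for item in soup.find_all("div", {"class": "tab-content"}):
    spans = item.find_all("span", {"class": "wlt_shortcode_map_location"})
    if not spans:
        # Indexing spans[0] here would raise IndexError on the second div.
        continue
    addresses.append(spans[0].text)

print(addresses)
```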

Since wlt_shortcode_map_location is used only once on the page, you can skip the loop and do a single recursive find from the top level.

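Later on you may hit encoding problems: r.content is the raw bytes of the response, whereas r.text is a string already decoded using the server's Content-Type header as a guide.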
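
To illustrate the difference without making a network call, here is a small sketch using plain bytes and str. The sample text and the UTF-8 charset are assumptions for illustration; the real page may declare something else.

```python
# Stand-in for r.content: the raw bytes a server might send (UTF-8 assumed here).
raw = "Café Gym, 1 Example Street".encode("utf-8")

# Roughly what r.text gives you when the declared charset is UTF-8.
decoded = raw.decode("utf-8")

# Decoding with the wrong charset does not fail here; it quietly garbles the text.
garbled = raw.decode("latin-1")

print(type(raw).__name__, type(decoded).__name__)
print(decoded == garbled)
```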

With all of that in mind, the following code seems to do what you want:

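import requests
from bs4 import BeautifulSoup

url = "http://www.gym-directory.com/listing/bulldog-gym/"
r = requests.get(url)

# r.text is the decoded string; r.content would be the raw bytes.
soup = BeautifulSoup(r.text, "html.parser")

# The class appears once on the page, so search from the top of the tree.
print(soup.find("span", {"class": "wlt_shortcode_map_location"}).text)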

Good luck!
