Question

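I want to use BeautifulSoup to iterate over an HTML file and find the tag whose content is "Preferred Name". This is the tag I am looking for (it is part of the file I want to search):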

 <td nowrap class="label">
    Preferred Name
    <span class="slot_labels"></span>
  </td>

I tried to search for it with this (doc is the name of that HTML file):

 soup = BeautifulSoup(doc)
 tags = soup.fetch('td')
 for tag in tags:
     if tag.contents[0] == 'Preferred Name':
         return tag

This code does not work. Can anyone help?

Answer 1

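The tag's first content item includes the surrounding whitespace and newlines, so comparing it directly with 'Preferred Name' never matches. Strip it first: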

soup = BeautifulSoup(doc)
tags = soup.fetch('td')
for tag in tags:
    if tag.contents[0] and tag.contents[0].strip() == 'Preferred Name':
        return tag
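
For anyone reading this with the current bs4 package, here is a minimal sketch of the same idea. It is not the code from the question: it assumes bs4 is installed, uses find_all() and the built-in "html.parser" backend, and the names find_label_cell and HTML are hypothetical, made up for illustration. The second table cell in the sample markup is also invented so that there is something that should not match. It additionally guards against empty cells and cells whose first child is a tag rather than text, which the loop above does not handle.

```python
from bs4 import BeautifulSoup

# Sample markup: the first <td> is copied from the question,
# the second <td> is a made-up cell that should not match.
HTML = """
<table><tr>
  <td nowrap class="label">
    Preferred Name
    <span class="slot_labels"></span>
  </td>
  <td class="value">something else</td>
</tr></table>
"""


def find_label_cell(markup, label):
    """Return the first <td> whose leading text, once stripped, equals label."""
    soup = BeautifulSoup(markup, "html.parser")
    for td in soup.find_all("td"):
        # Skip empty cells so contents[0] cannot raise IndexError.
        if not td.contents:
            continue
        first = td.contents[0]
        # Only text nodes (NavigableString, a str subclass) are compared;
        # a leading child tag is skipped.
        if isinstance(first, str) and first.strip() == label:
            return td
    return None


cell = find_label_cell(HTML, "Preferred Name")
print(cell["class"] if cell is not None else "not found")
```

Returning None when nothing matches keeps the caller's check simple. If the label might appear after a child tag instead of at the start of the cell, comparing against td.get_text().strip() would be one alternative, at the cost of also picking up text from nested tags.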
