Question

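I'm trying to find out whether a specific title exists inside a tag and, if the tag doesn't contain it, print the text held in the variable t. So far I can pull out the whole 'td' tag with: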

for t in soup.findAll("td",{"class" : "someClass"}):

But when I use:

title = "someTitle"
if title in t:
   print "contains title"
else:
   print "doesn't contain title"

it doesn't seem to check for the title at all, and everything passes through regardless. What am I doing wrong?

Sample HTML:

<html>
 <body>
  <td class="someClass">
  <td>
   Text
  </td>
  <img title ="someTitle">
  </td>
 </body>
</html>

Answer 1

Although a <td> cannot be nested inside another <td>, we can still extract the image title here in some way.

Python 2 code:

from BeautifulSoup import BeautifulSoup as bs
html = '''
<html>
 <body>
  <td class="someClass">
  <td>
   Text
  </td>
  <img title ="someTitle">
  </td>
 </body>
</html>
'''
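soup = bs(html)
# Collect every <td> whose class is "someClass"
tds = soup.findAll("td",{"class":"someClass"})
for td in tds:
    # Pretty-print the cell so that each tag ends up on its own line
    td_pretty = td.prettify()
    td_split_list = [line.strip() for line in td_pretty.split("\n")]
    # Index 4 is assumed to be the <img> line for this particular sample
    img = bs(td_split_list[4])
    print img.find("img").get("title")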

Output:

someTitle

We used BeautifulSoup's prettify method (see documentation here).
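
As for why the check in the question never matches: `title in t` tests membership among the tag's direct children, not among the attribute values of the tags inside it, so the string "someTitle" is never found. A minimal sketch of an attribute-based check follows, assuming Python 3 with the bs4 package and its built-in "html.parser" backend. The HTML here is a hypothetical, well-formed variant of the sample (two cells, only one of which holds the image), and the `results` and `in_checks` lists are illustrative names that do not appear in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical, well-formed variant of the sample HTML:
# the first cell holds the image, the second holds only text.
html = """
<table><tr>
  <td class="someClass"><img title="someTitle"></td>
  <td class="someClass">Text only</td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")
title = "someTitle"

in_checks = []  # what the original `title in t` test evaluates to
results = []    # what the attribute-based test reports

for t in soup.find_all("td", {"class": "someClass"}):
    # Membership on a Tag only compares against its direct children,
    # so an attribute value inside a child tag is never matched.
    in_checks.append(title in t)

    # Search the cell's descendants for any tag whose title attribute matches.
    match = t.find(attrs={"title": title})
    if match is not None:
        results.append("contains title")
    else:
        # No matching title: fall back to the cell's text.
        results.append(t.get_text(strip=True))

print(in_checks)
print(results)
```

Unlike indexing into the prettified lines, this lookup does not depend on where the <img> tag happens to fall in the markup. To restrict the search to images, `t.find("img", attrs={"title": title})` should work the same way.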
