Question

I am using BeautifulSoup to parse HTML and am trying to retrieve a title.

My code is as follows:

callerid = cell_list[0]
print callerid.find('a')

This returns the corresponding anchor tag, the one I am trying to extract my "title" from.

<a class="caller_ref" href="/tomasi/cardio/vgh/SPsdeGBHH" 
title="CDS1255S56d">identifier</a>

Now this is where it gets funky. As soon as I add ["title"] to my print statement to extract the title,

callerid = cell_list[0]
print callerid.find('a')["title"]

I get

AttributeError: 'NoneType' object has no attribute 'find'

How can this be "NoneType" when it clearly contains the anchor tag HTML shown in the first example, and how do I parse it to return the title?

Answer 1

callerid.find('a') should be callerid.find('a').a['title']. It may look like it, but callerid.find('a') does not actually return the contents of the tag! (In fact, the documentation is not very helpful about what it does return.)
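
For reference, here is a minimal sketch of what find() hands back in the two cases. It assumes bs4 is installed and uses the built-in "html.parser"; the surrounding td cells are made up for illustration, only the anchor itself comes from the question.

```python
from bs4 import BeautifulSoup

# Illustrative cells (assumed markup): one with the anchor from the question, one without
with_anchor = BeautifulSoup(
    '<td><a class="caller_ref" href="/tomasi/cardio/vgh/SPsdeGBHH" '
    'title="CDS1255S56d">identifier</a></td>', "html.parser")
without_anchor = BeautifulSoup('<td>Caller ID</td>', "html.parser")

found = with_anchor.find('a')       # a Tag object for the first matching <a>
missing = without_anchor.find('a')  # None, because nothing matched

print(type(found).__name__)   # class name of the returned object
print(found['title'])         # attribute lookup works on a Tag
print(missing)                # subscripting this value would fail
```

So the subscript only fails when find() had nothing to return, which suggests checking which element is actually being searched.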

Answer 2

Try:

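from bs4 import BeautifulSoup

# The markup from the question, joined into one string literal
content = ('<a class="caller_ref" href="/tomasi/cardio/vgh/SPsdeGBHH" '
           'title="CDS1255S56d">identifier</a>')
# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(content, "html.parser")
anchor = soup.find_all('a')[0]
# Tag.get() returns the attribute value, or None if the attribute is absent
print("title : " + anchor.get('title'))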
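
As a side note, a short sketch of how get() differs from subscripting when the attribute itself is missing. The two anchors below are hypothetical, and the 'n/a' fallback is just an example value.

```python
from bs4 import BeautifulSoup

# Hypothetical anchors: the second one has no title attribute
soup = BeautifulSoup('<a title="CDS1255S56d">one</a><a href="#">two</a>',
                     "html.parser")
first, second = soup.find_all('a')

title_first = first.get('title')            # the attribute value
title_second = second.get('title')          # None, since the attribute is absent
title_default = second.get('title', 'n/a')  # caller-supplied fallback

# Subscripting a Tag for a missing attribute raises KeyError instead
try:
    second['title']
    raised = False
except KeyError:
    raised = True

print(title_first, title_second, title_default, raised)
```

Note that this covers a missing attribute on an existing tag. If the tag itself was not found, the value is None and neither form can be used on it.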

Answer 3

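I found the error. I was basically parsing a table with many rows, and all the rows have anchor tags, so print callerid.find('a') works fine.

But with print callerid.find('a')["title"], the line fails on NoneType, because the first row of the table I am parsing is the only row (out of 19456) without a title tag, and that stops all further execution.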
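
A rough sketch of one way to guard against this, assuming a table shaped like the one described. The table markup, the header row text, and the titles list are all hypothetical stand-ins for the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical table: a header row without a link, then data rows
html = """
<table>
  <tr><td>Caller</td></tr>
  <tr><td><a class="caller_ref" href="/tomasi/cardio/vgh/SPsdeGBHH" title="CDS1255S56d">identifier</a></td></tr>
  <tr><td><a class="caller_ref" href="#">no title here</a></td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

titles = []
for row in soup.find_all('tr'):
    cell_list = row.find_all('td')
    if not cell_list:
        continue                      # row without cells
    anchor = cell_list[0].find('a')
    if anchor is None:
        continue                      # no anchor in this cell, skip the row
    title = anchor.get('title')       # None if the anchor has no title attribute
    if title is not None:
        titles.append(title)

print(titles)
```

Skipping rows this way keeps one odd row from stopping the whole run. Whether to skip silently or log the row is a judgment call.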

Thanks, everyone, for the help.

BeautifulSoup get title returns 'NoneType' object has no attribute '__getitem__'
