Question

I'm trying to get the abbreviations of the U.S. states, but this code:

{{1}}

raises AttributeError: 'NoneType' object has no attribute 'find_all'. What is wrong with my code?

Answer 1

Here you go.

I changed the class in source.find to 'wikitable sortable'. Also, calling abbs.get_text() gave me an error, so I used a list comprehension instead to get the text you want.

from bs4 import BeautifulSoup
from urllib.request import urlopen

web = urlopen('https://simple.wikipedia.org/wiki/List_of_U.S._states')
source = BeautifulSoup(web, 'lxml')
table = source.find(class_='wikitable sortable').find_all('b')
b_arr = '\n'.join([x.text for x in table])
print(b_arr)

Partial output:

AL
AK
AZ
AR
CA
CO

Answer 2

As Patrick pointed out,

source.find() returns only the first matching element.

For reference, here is the source code of the find() method:

def find(self, name=None, attrs={}, recursive=True, text=None, **kwargs):
    """Return only the first child of this Tag matching the given criteria."""
    r = None
    l = self.find_all(name, attrs, recursive, text, 1, **kwargs)
    if l:
        r = l[0]
    return r
findChild = find

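In the fetched HTML the table's class name is wikitable sortable, so, per the code above, find() matches nothing for the class you searched for and returns None.

So you may want to change your code to...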
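To make the None case concrete, here is a minimal offline sketch. The HTML string is a hypothetical stand-in for the downloaded page, and find_abbreviations is an illustrative helper name, not part of BeautifulSoup. It shows find() returning None for a class string that is not in the markup, and one way to fail with a clearer message than the AttributeError.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page: the class attribute mirrors
# the raw HTML ("wikitable sortable"), with no JavaScript-added class.
HTML = """
<table class="wikitable sortable">
  <tr><td><b>AL</b></td></tr>
  <tr><td><b>AK</b></td></tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")

# No element carries this exact class string, so find() returns None.
missing = soup.find("table", class_="wikitable sortable jquery-tablesorter")
print(missing)  # None


def find_abbreviations(soup, class_name):
    """Return the text of every <b> tag inside the first matching table.

    Illustrative helper (an assumption, not a library function): it checks
    for None before calling find_all, so a bad class name is reported clearly.
    """
    table = soup.find("table", class_=class_name)
    if table is None:
        raise ValueError(f"no <table> with class {class_name!r} found")
    return [b.get_text(strip=True) for b in table.find_all("b")]


print(find_abbreviations(soup, "wikitable sortable"))
```

Checking for None right after find() keeps the failure next to its cause instead of surfacing later as 'NoneType' object has no attribute 'find_all'.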

from bs4 import BeautifulSoup
from urllib.request import urlopen

url = 'https://simple.wikipedia.org/wiki/List_of_U.S._states'
web = urlopen(url)
source = BeautifulSoup(web, 'html.parser')

table = source.find('table', class_='wikitable')
abbs = table.find_all('b')

abbs_list = [i.get_text().strip() for i in abbs]
print(abbs_list)

I hope this answers your question. :)

Answer 3

As suggested in the comments, the HTML at that URL has no table with the class

'wikitable sortable jquery-tablesorter'

The class is actually

'wikitable sortable'
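How BeautifulSoup compares class strings can be checked offline. The sketch below uses a hypothetical one-row table with the class from the raw HTML and tries three lookups: the exact multi-class string, a single class, and a CSS selector via select_one.

```python
from bs4 import BeautifulSoup

# Hypothetical one-row table carrying the class found in the raw HTML.
HTML = '<table class="wikitable sortable"><tr><td><b>AL</b></td></tr></table>'
soup = BeautifulSoup(HTML, "html.parser")

# A string with spaces is compared against the whole class attribute value.
exact = soup.find("table", class_="wikitable sortable")

# A single class name matches any element that has it among its classes.
single = soup.find("table", class_="wikitable")

# A CSS selector requires both classes but tolerates additional ones.
selected = soup.select_one("table.wikitable.sortable")

# The class string seen in the browser is not in this markup, so no match.
browser_only = soup.find("table", class_="wikitable sortable jquery-tablesorter")

print(exact is not None, single is not None, selected is not None)
print(browser_only)  # None
```

Searching by a single class or by CSS selector is less sensitive to classes that scripts add in the browser after the page loads.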

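Also, find_all returns a list of all the matching tags, so you cannot call get_text() on it directly. You can use a list comprehension to extract the text of each element in the list. Here is code that works for your problem: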

from bs4 import BeautifulSoup
from urllib.request import urlopen
url = 'https://simple.wikipedia.org/wiki/List_of_U.S._states'
web = urlopen(url)
source = BeautifulSoup(web, 'html.parser')
table = source.find('table', {'class': 'wikitable sortable'})
abbs = table.find_all('b')
values = [ele.text.strip() for ele in abbs]
print(values)
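The point about find_all returning a list can be reproduced without a network request. This is a small sketch on a hypothetical HTML fragment: calling get_text() on the result of find_all raises AttributeError, while calling it on each element works.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment with two bold cells, one padded with whitespace.
HTML = "<table><tr><td><b> AL </b></td><td><b>AK</b></td></tr></table>"
soup = BeautifulSoup(HTML, "html.parser")

tags = soup.find_all("b")  # a list-like ResultSet, not a single Tag

# get_text() belongs to individual tags, so calling it on the list fails.
try:
    tags.get_text()
    failed_on_list = False
except AttributeError as exc:
    failed_on_list = True
    print("AttributeError:", exc)

# Apply get_text() to each element; strip=True trims surrounding whitespace.
values = [tag.get_text(strip=True) for tag in tags]
print(values)
```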
