Python, BeautifulSoup web scraping: "find_all" returns NoneType

Time: 2020-11-07 04:24:28

Tags: python python-3.x web-scraping beautifulsoup

I am trying to scrape this website, and the error below is what I get. url = 'https://www.worldometers.info/world-population/population-by-country/'

I have tried all the similar solutions on Stack Overflow, but none of them worked for me.

table_data=soup.find('table', {"id" : "example2"}, class_='table table-striped table-bordered dataTable no-footer')

headers = []
for i in table_data.find_all('th'):
    title = i.text
    headers.append(title)

Error message
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-129-e8b5de995a9d> in <module>
      1 table_data=soup.find('table', {"id" : "example2"}, class_='table table-striped table-bordered dataTable no-footer')
      2 headers = []
----> 3 for i in total_data.find_all('th'):
      4     title = i.text
      5     headers.append(title)

AttributeError: 'NoneType' object has no attribute 'find_all'

This is the code I tried for scraping the table rows, but it does not work either. Any further help would be appreciated.

for j in table_data.find_all('tr')[1:]:
        row_data = j.find_all('td')
        row = [tr.text for tr in row_data]
        length = len(df)
        df.loc[length] = row


ValueError: cannot set a frame with no defined columns

2 answers:

Answer 0: (score: 0)

"findAll" is a BeautifulSoup function, which means you have to use:

soup.findAll('th')
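
Note that in bs4 4.x `findAll` is the older spelling of `find_all`, so renaming the call may not by itself remove the error. The traceback suggests that `soup.find(...)` returned `None`, which happens when nothing matches all of the given filters. A minimal sketch of that situation, using a made-up HTML snippet (an assumption, not the real page content) whose table carries fewer classes than the string passed to `class_`:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page's static HTML (not the real page content).
html = """
<table id="example2" class="table table-striped table-bordered">
  <thead><tr><th>#</th><th>Country</th><th>Population</th></tr></thead>
  <tbody><tr><td>1</td><td>A</td><td>100</td></tr></tbody>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# A class_ string containing spaces is compared with the full class attribute,
# so the two extra classes here mean nothing matches and find() returns None.
missing = soup.find(
    "table",
    {"id": "example2"},
    class_="table table-striped table-bordered dataTable no-footer",
)

# Looking the table up by id alone avoids depending on the class list.
found = soup.find("table", id="example2")

# Guard against None before calling find_all on the result.
headers = [th.text for th in found.find_all("th")] if found is not None else []
```

If the table were missing from the downloaded HTML altogether, `found` would also be `None`, so the guard is worth keeping either way.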

Answer 1: (score: 0)

I looked at the page and used:

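import pandas as pd

# Look the table up by id only; "soup" is the parsed page from the question.
table_data = soup.find('table', id="example2")
# [1:] drops the first column of the header and of every row.
columns = [x.text for x in table_data.find("thead").find_all("th")][1:]
rows = [[x.text for x in y.find_all("td")][1:] for y in table_data.find("tbody").find_all("tr")]
dt = pd.DataFrame(rows, columns=columns)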

Test it ;-)
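
To try the same approach without fetching the page, here is a small self-contained sketch. The HTML string and its country names and figures are placeholders invented for illustration, and `get_text(strip=True)` is swapped in for `.text` only to trim whitespace. Passing `columns=` when the DataFrame is created also avoids the "cannot set a frame with no defined columns" error from the question, which comes from assigning rows to a frame that has no columns yet.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Placeholder HTML invented for this example; the real page has more columns.
html = """
<table id="example2">
  <thead><tr><th>#</th><th>Country</th><th>Population</th></tr></thead>
  <tbody>
    <tr><td>1</td><td>Alpha</td><td>100</td></tr>
    <tr><td>2</td><td>Beta</td><td>50</td></tr>
  </tbody>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

table_data = soup.find("table", id="example2")
if table_data is None:
    # Fail with a clear message instead of an AttributeError further down.
    raise ValueError("table with id 'example2' not found")

# [1:] drops the leading index column, as in the answer above.
columns = [th.get_text(strip=True)
           for th in table_data.find("thead").find_all("th")][1:]
rows = [[td.get_text(strip=True) for td in tr.find_all("td")][1:]
        for tr in table_data.find("tbody").find_all("tr")]

# Columns are defined at construction time, so every row has a place to go.
df = pd.DataFrame(rows, columns=columns)
```

All values in `df` are strings at this point; converting the numeric columns is a separate step.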