Question

I am trying to scrape the cuisines of New York City from this page: https://en.wikipedia.org/wiki/Cuisine_of_New_York_City

I get this error message: 'NoneType' object has no attribute 'find_all'

Here is the code I tried.

website_url = requests.get('https://en.wikipedia.org/wiki/Cuisine_of_New_York_City').text

soup = BeautifulSoup(website_url,'lxml')
table = soup.find('table',{'class':'wikitable sortable'})

headers = [header.text for header in table.find_all('th')]

table_rows = table.find_all('tr')        
rows = []
for row in table_rows:
   td = row.find_all('td')
   row = [row.text for row in td]
   rows.append(row)

with open('BON2_POPULATION1.csv', 'w') as f:
   writer = csv.writer(f)
   writer.writerow(headers)
   writer.writerows(row for row in rows if row)

Answer 1

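You get this error because soup.find cannot find any table tag whose class attribute is 'wikitable sortable', so it returns None. Calling table.find_all on None then raises the error. Check for None before using the result: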

import csv
import requests
from bs4 import BeautifulSoup

website_url = requests.get('https://en.wikipedia.org/wiki/Cuisine_of_New_York_City').text

soup = BeautifulSoup(website_url,'lxml')
table = soup.find('table',{'class':'wikitable sortable'})

if table is None:
    # handle the case where the table is not present in your html
    print('No matching table found')
else:
    # only touch the table once we know it exists
    headers = [header.text for header in table.find_all('th')]
    table_rows = table.find_all('tr')
    rows = []
    for row in table_rows:
        td = row.find_all('td')
        rows.append([cell.text for cell in td])

    with open('BON2_POPULATION1.csv', 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(headers)
        # skip rows without td cells (for example, the header row)
        writer.writerows(row for row in rows if row)
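
To see the None behaviour without hitting the network, here is a minimal sketch. The HTML string and the 'infobox' class are made-up stand-ins, not taken from the Wikipedia page, and the built-in html.parser is used instead of lxml so nothing extra needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in HTML: it contains a table, but not one with
# the classes the question searches for.
html = "<html><body><table class='infobox'><tr><th>Dish</th></tr></table></body></html>"
soup = BeautifulSoup(html, "html.parser")

# No table carries this class string, so find() returns None.
missing = soup.find("table", {"class": "wikitable sortable"})
# This table exists, so find() returns a Tag.
present = soup.find("table", {"class": "infobox"})

print(missing is None)

# Calling missing.find_all('th') here would raise the AttributeError from
# the question; calling it on a real Tag works.
headers = [th.get_text(strip=True) for th in present.find_all("th")]
print(headers)
```

The same check applies to any find() call: test the result for None before chaining find_all onto it.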

Answer 2

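I don't see an element matching that description. As a starting point, you can use :contains with bs4 4.7.1+ and target elements with the class mw-headline whose innerText/text contains the word cuisine. The resulting list needs a little cleaning. If you want something more specific, you will need to provide more information.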

import requests
from bs4 import BeautifulSoup as bs

r = requests.get('https://en.wikipedia.org/wiki/Cuisine_of_New_York_City')
soup = bs(r.content, 'lxml')
cuisines_dirty = [i.text for i in soup.select('.mw-headline:contains(cuisine)')]
#perform some sort of cleaning on list
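
As a rough idea of what the cleaning step could look like, here is a sketch on made-up HTML. The heading texts below are illustrative assumptions, not the real page content. It also swaps the :contains selector for a plain Python substring filter, which does the same job without depending on that pseudo-class being available; the cleaning itself (stripping whitespace and dropping duplicates) is only one guess at what "cleaning" might mean here.

```python
from bs4 import BeautifulSoup

# Hypothetical headings standing in for the page's mw-headline spans.
html = """
<h2><span class="mw-headline">Food identified with New York</span></h2>
<h3><span class="mw-headline"> Jewish cuisine </span></h3>
<h3><span class="mw-headline">Italian-American cuisine</span></h3>
<h3><span class="mw-headline">Jewish cuisine</span></h3>
"""
soup = BeautifulSoup(html, "html.parser")

# Select every headline, then keep those whose text mentions "cuisine".
cuisines_dirty = [
    el.get_text()
    for el in soup.select(".mw-headline")
    if "cuisine" in el.get_text()
]

# Clean: strip surrounding whitespace and drop duplicates, keeping order.
cuisines = []
for text in cuisines_dirty:
    name = text.strip()
    if name not in cuisines:
        cuisines.append(name)

print(cuisines)
```

Note that the substring test is case-sensitive, so a heading that only contains "Cuisine" with a capital C would be skipped unless the text is lowercased first.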

