I am trying to scrape the cuisines of New York City from this page: https://en.wikipedia.org/wiki/Cuisine_of_New_York_City
I get the error message: 'NoneType' object has no attribute 'find_all'
Here is the code I tried.
import csv
import requests
from bs4 import BeautifulSoup

website_url = requests.get('https://en.wikipedia.org/wiki/Cuisine_of_New_York_City').text
soup = BeautifulSoup(website_url, 'lxml')
table = soup.find('table', {'class': 'wikitable sortable'})
# The next line is where the error is raised: table is None here.
headers = [header.text for header in table.find_all('th')]
table_rows = table.find_all('tr')
rows = []
for row in table_rows:
    td = row.find_all('td')
    row = [row.text for row in td]
    rows.append(row)
with open('BON2_POPULATION1.csv', 'w') as f:
    writer = csv.writer(f)
    writer.writerow(headers)
    writer.writerows(row for row in rows if row)
Answer 0 (score: 0)
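There isn't any table tag with the class wikitable sortable on that page, so soup.find returns None, and calling find_all on None raises the error. Check for None before using the result:
import csv
import requests
from bs4 import BeautifulSoup

website_url = requests.get('https://en.wikipedia.org/wiki/Cuisine_of_New_York_City').text
soup = BeautifulSoup(website_url, 'lxml')
table = soup.find('table', {'class': 'wikitable sortable'})
if table is None:
    # Handle the case where the table is not present in the HTML.
    print('No matching table found.')
else:
    # Only touch the table once we know it exists.
    headers = [header.text for header in table.find_all('th')]
    rows = []
    for table_row in table.find_all('tr'):
        cells = [cell.text for cell in table_row.find_all('td')]
        rows.append(cells)
    with open('BON2_POPULATION1.csv', 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(headers)
        # Skip empty rows, such as the header row, which has no td cells.
        writer.writerows(row for row in rows if row)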
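The None check above can be tried without hitting the network. The following is a minimal sketch, not the answerer's code: the helper name table_to_rows, the two inline HTML samples, and the placeholder cell values are all made up for illustration, and it uses the built-in html.parser instead of lxml so that it needs no extra install.

```python
from bs4 import BeautifulSoup


def table_to_rows(html, css_class):
    """Return (headers, rows) for the first table with css_class, or None.

    Hypothetical helper for illustration; not part of bs4.
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_=css_class)
    if table is None:
        # Nothing matched, so there is nothing to call find_all on.
        return None
    headers = [th.get_text(strip=True) for th in table.find_all("th")]
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # the header row has no td cells, so it is skipped
            rows.append(cells)
    return headers, rows


# Made-up sample documents, one with a matching table and one without.
SAMPLE_WITH_TABLE = """
<table class="wikitable sortable">
  <tr><th>Dish</th><th>Origin</th></tr>
  <tr><td>Dish A</td><td>Place A</td></tr>
</table>
"""
SAMPLE_WITHOUT_TABLE = "<p>No tables here.</p>"

found = table_to_rows(SAMPLE_WITH_TABLE, "wikitable")
missing = table_to_rows(SAMPLE_WITHOUT_TABLE, "wikitable")
print(found)
print(missing)
```

Returning None from the helper keeps the "table not found" case explicit, so the caller decides whether to skip the CSV step or report the problem.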
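Answer 1 (score: 0)
I can't see an element matching that description. As a starting point, with bs4 4.7.1+ you can use :contains to grab the elements with class mw-headline whose innerText/text contains the word cuisine. The resulting list needs a little cleaning. If you want something more specific, more information is needed.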
import requests
from bs4 import BeautifulSoup as bs
r = requests.get('https://en.wikipedia.org/wiki/Cuisine_of_New_York_City')
soup = bs(r.content, 'lxml')
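# Note: newer soupsieve releases prefer :-soup-contains(...) and may warn about :contains(...).
cuisines_dirty = [i.text for i in soup.select('.mw-headline:contains(cuisine)')]
# perform some sort of cleaning on the list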
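What the cleaning step should do depends on what the selector actually returns, which is not specified here. As one possible sketch, assuming the cleaning only needs to normalise whitespace, drop empty strings, and remove duplicates, a hypothetical helper could look like this (clean_headlines and the sample input are invented for illustration):

```python
def clean_headlines(items):
    """Normalise whitespace, drop empty strings, de-duplicate in order.

    Hypothetical helper; the real cleaning rules are an assumption.
    """
    seen = set()
    cleaned = []
    for item in items:
        text = " ".join(item.split())  # collapse spaces and newlines
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


# Made-up input standing in for cuisines_dirty.
sample = ["  Street food\n", "Ethnic cuisine", "", "Ethnic cuisine"]
cuisines = clean_headlines(sample)
print(cuisines)
```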
Dirty list: