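When I append .text to the variables in my loop, it raises an AttributeError. If I remove it, the script prints all the tags along with their contents. I don't understand why it keeps throwing AttributeError.
Any help is appreciated, T.T
I have tried: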
biz_name = result.find('span', attrs={'itemprop':'name'}).text
and
biz_name = result.find('span', attrs={'itemprop':'name'}).text[1:-1]
Here is one cell of the results:
<span itemprop="name">Efrain Jimenez Jr. General Contractor Inc.</span>
And the script:
import requests
from bs4 import BeautifulSoup
import pandas as pd
import csv
r = requests.get('https://www.yellowpages.com/search?search_terms=remodeling&geo_location_terms=New+York%2C+NY')
soup = BeautifulSoup(r.text, 'html.parser')
results = soup.find_all('div', attrs={'class':'info'})
records = []
for result in results:
    biz_name = result.find('span', attrs={'itemprop':'name'})
    biz_phone = result.find('div', attrs={'itemprop':'telephone'})
    biz_address = result.find('span', attrs={'itemprop':'streetAddress'})
    biz_city = result.find('span', attrs={'itemprop':'addressLocality'})
    biz_zip = result.find('span', attrs={'itemprop':'postalCode'})
    records.append((biz_name, biz_phone, biz_address, biz_city, biz_zip))
df = pd.DataFrame(records, columns=['biz_name', 'biz_phone', 'biz_address', 'biz_city', 'biz_zip'])
df.to_csv('Yp_Remodel.csv', index=False, encoding='utf-8')
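Answer 0: (score: 1)
Maybe not the ideal answer, but it looks like some of the values are None in some cases: find() returns None when no matching element exists in a listing, and accessing .text on None raises the AttributeError. Try this; it worked for me.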
import requests
from bs4 import BeautifulSoup
import pandas as pd
import csv
r = requests.get('https://www.yellowpages.com/search?search_terms=remodeling&geo_location_terms=New+York%2C+NY')
soup = BeautifulSoup(r.text, 'html.parser')
results = soup.find_all('div', attrs={'class':'info'})
records = []
for result in results:
    # Fall back to an empty string whenever find() returns None for a field.
    biz_name = result.find('span', attrs={'itemprop':'name'}).text if result.find('span', attrs={'itemprop':'name'}) is not None else ''
    # The telephone is in a div, so the None check must look for a div as well.
    biz_phone = result.find('div', attrs={'itemprop':'telephone'}).text if result.find('div', attrs={'itemprop':'telephone'}) is not None else ''
    biz_address = result.find('span', attrs={'itemprop':'streetAddress'}).text if result.find('span', attrs={'itemprop':'streetAddress'}) is not None else ''
    biz_city = result.find('span', attrs={'itemprop':'addressLocality'}).text if result.find('span', attrs={'itemprop':'addressLocality'}) is not None else ''
    biz_zip = result.find('span', attrs={'itemprop':'postalCode'}).text if result.find('span', attrs={'itemprop':'postalCode'}) is not None else ''
    records.append((biz_name, biz_phone, biz_address, biz_city, biz_zip))
df = pd.DataFrame(records, columns=['biz_name', 'biz_phone', 'biz_address', 'biz_city', 'biz_zip'])
df.to_csv('Yp_Remodel.csv', index=False, encoding='utf-8')
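The same None check can probably be written once instead of repeating find() twice per field. Below is a minimal sketch of that idea. The helper name text_or_default and the SAMPLE_HTML snippet (including its phone number) are made up for illustration and are not taken from the real page; the sample stands in for the Yellow Pages response so the example runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the second listing has no telephone element,
# which is the situation that makes find() return None.
SAMPLE_HTML = """
<div class="info">
  <span itemprop="name">Efrain Jimenez Jr. General Contractor Inc.</span>
  <div itemprop="telephone">(555) 010-0000</div>
</div>
<div class="info">
  <span itemprop="name">Listing Without Phone</span>
</div>
"""


def text_or_default(parent, tag, itemprop, default=''):
    """Return the stripped text of the first matching child, or default if there is none."""
    node = parent.find(tag, attrs={'itemprop': itemprop})
    return node.get_text(strip=True) if node is not None else default


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
records = []
for result in soup.find_all('div', attrs={'class': 'info'}):
    records.append((
        text_or_default(result, 'span', 'name'),
        text_or_default(result, 'div', 'telephone'),
    ))

print(records)
```

Each field is looked up only once, and the tag name is passed in a single place, so the check and the .text access cannot drift apart.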