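I am writing my first program, which scrapes pages from multiple websites. I want to consolidate the prices of products from different URLs at different times. However, I am getting an error. Here is my code: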
#imports
import pandas as pd
import requests
from bs4 import BeautifulSoup
#Product Websites For Consolidation
url = ['https://www.aeroprecisionusa.com/ar15/lower-receivers/stripped-lowers?product_list_limit=all', 'https://www.aeroprecisionusa.com/ar15/lower-receivers/complete-lowers?product_list_limit=all']
for website in url:
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0"}
    page = requests.get(website, headers=headers)
    soup = BeautifulSoup(page.content, 'html.parser')
    #Locating All Products On Page
    all_products_on_page = soup.find(class_='products wrapper container grid products-grid')
    individual_items = all_products_on_page.find_all(class_='product-item-info')
    #Breaking Down Product By Name And Price
    aero_product_name = [individual_items.find(class_='product-item-link').text for individual_items in individual_items]
    aero_product_price = [individual_items.find(class_='price').text for individual_items in individual_items]
    Aero_Stripped_Lowers_Consolidated = pd.DataFrame(
        {'Aero Stripped Lower': aero_product_name,
         'Prices': aero_product_price,
         })
    Aero_Stripped_Lowers_Consolidated.to_csv('MasterPriceTracker.csv')
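For context, this is roughly the shape of table I am hoping to end up with. It is only a minimal sketch, assuming each page gives me a list of names and a list of prices. The `consolidate` helper, the `Source` and `Scraped At` columns, and the sample rows are all made up for illustration and are not part of my script.

```python
from datetime import datetime, timezone

import pandas as pd


def consolidate(results, scraped_at):
    """Combine per-URL (names, prices) results into one DataFrame.

    `results` maps a source label to a (names, prices) pair. This shape is
    an assumption made for illustration only.
    """
    frames = []
    for source, (names, prices) in results.items():
        frames.append(pd.DataFrame({
            "Product": names,
            "Price": prices,
            "Source": source,          # which page the row came from
            "Scraped At": scraped_at,  # same timestamp for the whole run
        }))
    # One frame per URL, stacked into a single table with a fresh index.
    return pd.concat(frames, ignore_index=True)


# Made-up sample data standing in for two scraped pages.
sample = {
    "stripped-lowers": (["Lower A", "Lower B"], ["$1.00", "$2.00"]),
    "complete-lowers": (["Lower C"], ["$3.00"]),
}
table = consolidate(sample, datetime.now(timezone.utc).isoformat())
print(table)
```

The idea would be one row per product per run, so that prices from different times can sit side by side in the same file.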
Below is my error. It looks like Python has a problem with `.text`. However, when I scrape only one URL, the script runs without this error.
Traceback (most recent call last):
  File "C:/Users/The Rossatron/Documents/PyCharm_Projects/Aero Stripped Lower List/Master_Price_Tracker.py", line 16, in <module>
    aero_product_price = [individual_items.find(class_='price').text for individual_items in individual_items]
  File "C:/Users/The Rossatron/Documents/PyCharm_Projects/Aero Stripped Lower List/Master_Price_Tracker.py", line 16, in <listcomp>
    aero_product_price = [individual_items.find(class_='price').text for individual_items in individual_items]
AttributeError: 'NoneType' object has no attribute 'text'
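I can get what looks like the same failure offline. This is a minimal sketch, assuming (I have not confirmed this on the real page) that at least one product block has no element with class `price`. The HTML is made up, and `text_or_none` is a hypothetical helper I wrote only for this test.

```python
from bs4 import BeautifulSoup

# Made-up HTML: the second product has no element with class "price".
html = """
<div class="product-item-info">
  <a class="product-item-link">Lower A</a><span class="price">$1.00</span>
</div>
<div class="product-item-info">
  <a class="product-item-link">Lower B</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
items = soup.find_all(class_="product-item-info")

# find() returns None when nothing matches, so calling .text on the
# result raises AttributeError: 'NoneType' object has no attribute 'text'.
missing = items[1].find(class_="price")
print(missing)


def text_or_none(tag, css_class):
    """Hypothetical helper: return the stripped text, or None if no match."""
    found = tag.find(class_=css_class)
    return found.get_text(strip=True) if found is not None else None


prices = [text_or_none(item, "price") for item in items]
print(prices)
```

With the guard in place the list comprehension no longer raises here, but I do not know whether a missing price element is actually what happens on the second URL.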
I would appreciate any help with this. It seems like a simple error, but I have been troubleshooting it for hours.
Thanks!