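I wrote some code that fetches information about an Amazon product so that I can read the price and set a condition on it. If the price condition is met, I send myself a message via Gmail. The problem is that when I use the code to fetch the listed price, I get this error:
'NoneType' object has no attribute 'get_text'
Here is my code. Not all of it, only the part that collects the product information: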
url="https://www.amazon.de/Sony-DigitalKamera-Touch-Display-Vollformatsensor-KartenSlots/dp/B07B4L1PQ8/ref=sr_1_3?keywords=sony+a7&qid=1561393494&s=gateway&sr=8-3"
headers={"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36"}
page=requests.get(url,headers=headers)
soup=BeautifulSoup(page.content,'html.parser')
title=soup.find(id="productTitle").get_text()
price=soup.find(id="priceblock_ourprice").get_text()
converted_price=float(price[0:6])
print(converted_price)
print(title.strip())
Answer 0 (score: 0)
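No element with the id priceblock_ourprice is found in the page you are scraping, so soup.find returns None and calling get_text() on it raises the error. The title lookup works: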
from bs4 import BeautifulSoup
import requests
url = "https://www.amazon.de/Sony-DigitalKamera-Touch-Display-Vollformatsensor-KartenSlots/dp/B07B4L1PQ8/ref=sr_1_3?keywords=sony+a7&qid=1561393494&s=gateway&sr=8-3"
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36"}
page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')
title = soup.find('span', {'id': 'productTitle'}).text
print(title.strip())
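To make the missing element easier to spot, one option is to check the result of soup.find before calling get_text() on it. This is only a minimal sketch: the helper name text_by_id and the inline SAMPLE_HTML string are made up for illustration (the sample stands in for an Amazon response that contains the title but no price element), so it runs without any network access.

```python
from bs4 import BeautifulSoup

# Stand-in HTML (an assumption for illustration): it has a title but no price element.
SAMPLE_HTML = '<html><body><span id="productTitle">  Sony Alpha 7M3  </span></body></html>'


def text_by_id(soup, element_id):
    """Return the stripped text of the element with this id, or None if it is absent."""
    element = soup.find(id=element_id)
    if element is None:
        # soup.find returns None when nothing matches, which is what
        # triggers "'NoneType' object has no attribute 'get_text'".
        return None
    return element.get_text(strip=True)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
title = text_by_id(soup, "productTitle")
price = text_by_id(soup, "priceblock_ourprice")

print(title)
if price is None:
    print("price element not found; inspect page.content to see what was actually returned")
else:
    print(price)
```

With the real page you would build soup from page.content as in the code above and, when the price comes back as None, look at the returned HTML to see which id (if any) holds the price.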
Answer 1 (score: 0)
Just change the User-Agent to Mozilla and try the following code.
url="https://www.amazon.de/Sony-DigitalKamera-Touch-Display-Vollformatsensor-KartenSlots/dp/B07B4L1PQ8/ref=sr_1_3?keywords=sony+a7&qid=1561393494&s=gateway&sr=8-3"
headers={"User-Agent":"Mozilla/5.0"}
page=requests.get(url,headers=headers)
soup=BeautifulSoup(page.content,'html.parser')
title=soup.find(id="productTitle").get_text()
price=soup.find(id="priceblock_ourprice").get_text()
price=price.replace(',','')
converted_price=float(price[0:6])
print(converted_price)
print(title.strip())
Output:
1.9213
Sony Alpha 7M3 E-Mount Vollformat Digitalkamera ILCE-7M3 (24,2 Megapixel, 7,6cm (3 Zoll) Touch-Display, Exmor R CMOS Vollformatsensor, XGA OLED Sucher, 2 Kartenslots, nur Gehäuse) schwarz
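Note that 1.9213 is probably not the real price. amazon.de formats numbers the German way, with "." as the thousands separator and "," as the decimal separator, so removing the comma and slicing six characters would turn a string like "1.921,38 €" into 1.9213. That sample string is an assumption inferred from the output above, not something read from the page. A rough sketch of a more careful conversion, with parse_german_price as a hypothetical helper name:

```python
import re


def parse_german_price(text):
    """Convert a German-formatted price such as '1.921,38 €' to a float."""
    # Grab the first run of digits with optional '.' group separators
    # and an optional ',' decimal part.
    match = re.search(r"\d[\d.]*(?:,\d+)?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    number = match.group(0)
    # Drop the thousands separators, then turn the decimal comma into a dot.
    number = number.replace(".", "").replace(",", ".")
    return float(number)


# Sample strings are assumptions for illustration, not scraped values.
print(parse_german_price("1.921,38 €"))
print(parse_german_price("49,99 €"))
```

This replaces both the price.replace(',', '') call and the price[0:6] slice, so it does not depend on the price having a fixed number of characters.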