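I tried this program to scrape data from Amazon, but it gives me an error. Besides get_text(), I also tried extract(), and also just strip() on its own; all of them raise an AttributeError. What should I do?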
import urllib.request
from bs4 import BeautifulSoup
import pymysql.cursors

a = input('enter the item to be searched :')
a = a.replace(" ", "")
html = urllib.request.urlopen("https://www.amazon.in/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=" + a)
bsObj = BeautifulSoup(html, 'lxml')
recordList = bsObj.findAll('a', class_='a-link-normal a-text-normal')
connection = pymysql.connect(host='localhost',
                             user='root',
                             password='',
                             db='shopping',
                             charset='utf8mb4',
                             cursorclass=pymysql.cursors.DictCursor)
try:
    with connection.cursor() as cursor:
        for record in recordList:
            name = record.find("h2", {"class": "a-size-small a-color-base s-inline s-access-title a-text-normal"}).get_text().strip()
            sale_price = record.find("span", {"class": "currencyINR"}).get_text().strip()
            category = record.find("span", {"class": "a-color-base a-text-bold"}).get_text().strip()
            sql = "INSERT INTO `amazon` (`name`, `sale_price`, `category`) VALUES (%s, %s, %s)"
            cursor.execute(sql, (name, sale_price, category))
    connection.commit()
finally:
    connection.close()
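Answer 0 (score: 1)
As MoxieBall said in the comment above, your call to record.find is returning None. find returns None when no matching element exists inside record, and calling .get_text() on None is what raises the AttributeError. Check the value before calling .get_text() on it.
It might look like this: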
raw_sale_price = record.find("span", {"class": "currencyINR"})
if raw_sale_price:
    sale_price = raw_sale_price.get_text().strip()
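The same check is needed for all three fields in the question's loop, so it may be easier to put it in one place. The following is only a rough sketch of that idea: the helper names text_or_none and extract_row are made up for illustration and are not part of BeautifulSoup, and skipping a record when any field is missing is an assumption about what you want to store.

```python
def text_or_none(parent, tag_name, css_class):
    """Return the stripped text of the first matching child, or None.

    `parent` is expected to be a BeautifulSoup Tag, such as `record`
    in the question's loop. Hypothetical helper, not a bs4 function.
    """
    node = parent.find(tag_name, {"class": css_class})
    if node is None:
        # No matching element: report that instead of calling
        # .get_text() on None and raising AttributeError.
        return None
    return node.get_text().strip()


def extract_row(record):
    """Return (name, sale_price, category), or None if any field is missing."""
    name = text_or_none(
        record, "h2",
        "a-size-small a-color-base s-inline s-access-title a-text-normal")
    sale_price = text_or_none(record, "span", "currencyINR")
    category = text_or_none(record, "span", "a-color-base a-text-bold")
    if None in (name, sale_price, category):
        # Assumption: incomplete records are skipped, not inserted.
        return None
    return (name, sale_price, category)
```

Inside the loop you would then call row = extract_row(record) and run cursor.execute(sql, row) only when row is not None. If most records come back as None, the class names probably no longer match the HTML that Amazon actually returned, which is worth checking by printing one record.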