当我运行此代码时,它给了我一个空括号。我是网络抓取新手,所以我不知道我做错了什么。
import requests
from bs4 import BeautifulSoup
url = 'https://www.amazon.com/s/ref=nb_sb_noss_1?url=search-alias%3Daps&field-keywords=laptop'
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
container = soup.findAll('li', {'class': 's-result-item celwidget '})
#btw the space is also there in the html code
print(container)
结果:
[]
我尝试的是从网站上获取html代码,并通过存储所有信息的li
标签进行抄送,这样我就可以在for循环中打印出所有信息。
如果有人想解释如何使用BeautifulSoup,我们总能说话。 谢谢你们。
答案 0 :(得分:0)
因此,抓住产品和价格的工作代码可能看起来像这样。
import requests
from bs4 import BeautifulSoup
url = 'https://www.amazon.com/s/ref=nb_sb_noss_1?url=search-alias%3Daps&field-keywords=laptop'
r = requests.get(url, headers={'User-Agent': 'Mozilla Firefox'})
soup = BeautifulSoup(r.text, 'html.parser')
container = soup.findAll('li', {'class': 's-result-item celwidget '})
for cont in container:
h2 = cont.h2.text.strip()
# Amazon lists prices in two ways. If one fails, use the other
try:
currency = cont.find('sup', {'class': 'sx-price-currency'}).text.strip()
price = currency + cont.find('span', {'class': 'sx-price-whole'}).text.strip()
except:
price = cont.find('span', {'class': 'a-size-base a-color-base'})
print('Product: {}, Price: {}'.format(h2, price))
如果这有助于您进一步帮助我......