This is the code I have so far:
for page in range(1, 5):
    guitarPage = requests.get('https://www.guitarguitar.co.uk/guitars/electric/page-'.format(page)).text
    soup = BeautifulSoup(guitarPage, 'lxml')
    # row = soup.find(class_='row products flex-row')
    guitars = soup.find_all(class_='col-xs-6 col-sm-4 col-md-4 col-lg-3')
This is the actual loop that iterates over the products:
for guitar in guitars:
    title_text = guitar.h3.text.strip()
    print('Guitar Name: ', title_text)
    price = guitar.find(class_='price bold small').text.strip()
    print('Guitar Price: ', price)
    time.sleep(0.5)
So far, this code only ever scrapes the same page and never moves on to the next one. The site's URLs follow the pattern page-2, page-3, and so on.
Answer 0 (score: 0)
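You need to add a {} placeholder to the URL string. Without it, .format(page) has nothing to substitute, so every iteration requests the same URL. I also added the import for the time module.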
import requests
from bs4 import BeautifulSoup
import time

for page in range(1, 5):
    # The {} placeholder is where .format(page) inserts the page number.
    guitarPage = requests.get('https://www.guitarguitar.co.uk/guitars/electric/page-{}'.format(page)).text
    soup = BeautifulSoup(guitarPage, 'lxml')
    # row = soup.find(class_='row products flex-row')
    # Each product tile on the page carries this combination of CSS classes.
    guitars = soup.find_all(class_='col-xs-6 col-sm-4 col-md-4 col-lg-3')
    for guitar in guitars:
        title_text = guitar.h3.text.strip()
        price = guitar.find(class_='price bold small').text.strip()
        print('Guitar Name: ', title_text, 'Guitar Price: ', price)
        # Brief pause between iterations.
        time.sleep(0.5)
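To see why the placeholder matters without sending any requests, here is a minimal sketch that only builds the URLs. The helper name build_page_urls and the constant BASE_URL are made up for this illustration and do not appear in the original code.

```python
# Illustration only: no network access, just string formatting.
# BASE_URL and build_page_urls are hypothetical names for this sketch.
BASE_URL = 'https://www.guitarguitar.co.uk/guitars/electric/page-'


def build_page_urls(template, first, last):
    """Return template.format(page) for each page in first..last inclusive."""
    return [template.format(page) for page in range(first, last + 1)]


# Without a {} placeholder, str.format() has nothing to replace and
# silently returns the string unchanged, so every URL is identical.
broken = build_page_urls(BASE_URL, 1, 3)

# With the placeholder, each page number ends up in the URL.
fixed = build_page_urls(BASE_URL + '{}', 1, 3)

for url in broken:
    print('broken:', url)
for url in fixed:
    print('fixed: ', url)
```

Note that str.format() raises no error when it is given arguments that the string never uses, which is why the original loop ran without complaint. An f-string such as f'...page-{page}' would work just as well as the {} plus .format(page) form.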