I'm currently scraping a website that loads more data automatically after the page itself has loaded. I'm using BeautifulSoup and requests.
import requests
from bs4 import BeautifulSoup

page = requests.get("https://www.monki.com/en/newin/view-all-new.html")
soup = BeautifulSoup(page.content, 'html.parser')

article_codes = []
for k in soup.find_all('div', attrs={"class": "producttile-details"}):
    article_code = k.find('span', attrs={'class': "articleCode"})
    print(article_code)
    # find() returns None when a tile has no articleCode span
    if article_code is not None:
        article_codes.append(article_code.text)
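With this code I only get the data from the initially loaded page, but I want all of the data that appears after the page loads more items.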
Answer 0 (score: 0)
The page uses JavaScript to load the additional pages of products. You can simulate those requests with the requests module.
For example:
import requests
from bs4 import BeautifulSoup

# Endpoint that returns one batch of products as an HTML fragment
url = 'https://www.monki.com/en_eur/newin/view-all-new/_jcr_content/productlisting.products.html'
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0',
}

with requests.Session() as s:
    # Visit the listing page first (presumably so the session picks up any cookies)
    s.get('https://www.monki.com/en_eur/newin/view-all-new.html', headers=headers)
    for page in range(0, 10):  # <-- adjust to required number of pages
        # The offset advances in steps of 28, one batch per request
        soup = BeautifulSoup(s.get(url, params={'offset': page*28}, headers=headers).content, 'html.parser')
        for product in soup.select('.o-product'):
            name = product.select_one('.product-name').get_text(strip=True)
            price = product.select_one('.price-tag').get_text(strip=True)
            link = product.select_one('.a-link')['href']
            print('{:<50} {:<10} {}'.format(name, price, link))
This prints all the products:
NEW! Maxi smock dress €30 https://www.monki.com/en_eur/clothing/dresses/midi-dresses/product.midi-button-up-shirt-dress-black.0871799004.html
NEW! Retro skater dress €20 https://www.monki.com/en_eur/clothing/dresses/mini-dresses/product.retro-skater-dress-white.0688447029.html
NEW! Mozik block jeans €40 https://www.monki.com/en_eur/clothing/jeans/product.mozik-block-jeans-blue.0874088001.html
NEW! Pack of two scrunchies €6 https://www.monki.com/en_eur/accessories/hair-accessories/product.pack-of-two-scrunchies-beige.0530296078.html
NEW! Mini hand bag €18 https://www.monki.com/en_eur/accessories/bags,-wallets-belts/bags/product.mini-hand-bag-black.0826291006.html
NEW! Fitted crop top €10 https://www.monki.com/en_eur/clothing/tops/t-shirts/product.fitted-crop-top-purple.0906440002.html
NEW! Tiered smock dress €30 https://www.monki.com/en_eur/clothing/dresses/midi-dresses/product.tiered-smock-dress-blue.0895277004.html
NEW! Mini hand bag €18 https://www.monki.com/en_eur/accessories/bags,-wallets-belts/bags/product.mini-hand-bag-beige.0826291008.html
NEW! Fitted t-shirt €10 https://www.monki.com/en_eur/clothing/tops/t-shirts/product.fitted-t-shirt-purple.0905746002.html
NEW! Shoulder pads t-shirt dress €25 https://www.monki.com/en_eur/clothing/dresses/mini-dresses/product.shoulder-pads-t-shirt-dress-beige.0929301002.html
NEW! Yoko mid blue jeans €40 https://www.monki.com/en_eur/clothing/jeans/product.yoko-mid-blue-jeans-blue.0656425001.html
NEW! Yoko classic blue jeans €40 https://www.monki.com/en_eur/clothing/jeans/product.yoko-classic-blue-jeans-blue.0807218001.html
NEW! Pleated midi skirt €25 https://www.monki.com/en_eur/clothing/skirts/midi-skirts/product.pleated-midi-skirt-black.0562278003.html
... and so on.
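The loop above uses a fixed page count that has to be adjusted by hand. As a rough sketch of an alternative, the helper below keeps requesting batches until one comes back empty. The names `paginate`, `fetch_page` and `max_pages` are hypothetical and not part of the original answer, and the sketch assumes the endpoint returns no `.o-product` elements once the offset runs past the last product, which has not been verified against the site.

```python
from typing import Callable, Iterator, List


def paginate(fetch_page: Callable[[int], List[dict]],
             page_size: int = 28,
             max_pages: int = 100) -> Iterator[dict]:
    """Yield items batch by batch until a batch comes back empty.

    fetch_page(offset) is a caller-supplied (hypothetical) function that
    returns the list of items found at the given offset.
    max_pages is a safety cap in case the endpoint never returns an empty batch.
    """
    for page in range(max_pages):
        items = fetch_page(page * page_size)
        if not items:
            # Assumption: an empty batch means there are no more products
            break
        yield from items
```

Here `fetch_page` would wrap the `s.get(url, params={'offset': offset}, headers=headers)` call and the `.o-product` parsing from the answer, returning for example a list of `{'name': ..., 'price': ..., 'link': ...}` dictionaries. The default `page_size` of 28 is taken from the offset step used in the answer.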