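I'm trying to scrape a website, but the scraper seems to return inconsistent data from run to run. Sometimes it gives me all the data I asked for, and sometimes it doesn't. In my prices evaluate call, it sometimes returns the correct data, but other times it returns undefined.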
import puppeteer from "puppeteer"
import useAddFirestore from "../hooks/useAddFirestore.js"

export default async function nikeScraper(date){
  const browser = await puppeteer.launch({
    headless: false
  });
  const page = await browser.newPage();
  await page.setDefaultNavigationTimeout(0);
  await page.goto("https://www.nike.com/w/sale-shoes-3yaepzy7ok");
  const nikeData = []

  const titles = await page.evaluate(() => {
    const titles = document.querySelectorAll(".product-card__title")
    const titleList = [...titles]
    const text = titleList.map(title => title.innerText)
    return text
  })
  titles.forEach((el, i) => {
    nikeData[i] = {}
    nikeData[i].title = el
    nikeData[i].date = date
    nikeData[i].brand = "Nike"
  })

  const links = await page.evaluate(() => {
    const links = document.querySelectorAll(".product-card__img-link-overlay")
    const linksList = [...links]
    const href = linksList.map(link => link.href)
    return href
  })
  links.forEach((el, i) => {
    nikeData[i].link = el
  })

  const prices = await page.evaluate(() => {
    const prices = document.querySelectorAll(".product-price__wrapper")
    const priceList = [...prices]
    const text = priceList.map(price => price.innerText)
    return text
  })
  prices.forEach((el, i) => {
    const splitEl = el.split("\n")
    nikeData[i].sale = splitEl[0]
    nikeData[i].retail = splitEl[1]
  })

  const images = await page.evaluate(() => {
    const images = document.querySelectorAll("img")
    const imageList = [...images]
    const src = imageList.map(img => img.src).filter(src => src.includes("static.nike.com"))
    return src
  })
  images.forEach((el, i) => {
    nikeData[i].image = el
  })

  await browser.close();

  for(let entry of nikeData){
    useAddFirestore(entry)
  }
}
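I built an almost identical scraper for another website and it works every time, so I don't know why this one isn't working.
Example of the data returned: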
{
title: 'ZX 2K 4D SHOES',
brand: 'Adidas',
image: 'https://assets.adidas.com/images/w_385,h_385,f_auto,q_auto:sensitive,fl_lossy/d071967e4a624b11a32eabb300e7a801_9366/zx-2k-4d-shoes.jpg',
sale: '',
retail: undefined
}
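
One thing I noticed while looking at this output: `sale: ''` together with `retail: undefined` is exactly what `split("\n")` produces for an empty string, so the price wrapper's `innerText` was probably empty when it was read (possibly because the prices had not rendered yet, though that is only a guess). Below is a minimal sketch of a more defensive price parser. `parsePriceText` is a hypothetical helper, not part of the scraper above, and the price strings are made-up illustrative values. Like the original code, it assumes the first line is the sale price and the second is the retail price.

```javascript
// Hypothetical helper (not in the original scraper): turn the innerText of one
// price wrapper into { sale, retail }, using null instead of '' or undefined
// when a line is missing.
function parsePriceText(text) {
  const parts = (text ?? "")
    .split("\n")                       // same split the scraper uses
    .map(part => part.trim())          // drop stray whitespace around each line
    .filter(part => part.length > 0);  // drop empty lines, e.g. from "" input
  return {
    sale: parts[0] ?? null,    // assumption: first line is the sale price
    retail: parts[1] ?? null,  // assumption: second line is the retail price
  };
}

// Illustrative inputs only; these are not real scraped values.
console.log(parsePriceText("$67.97\n$90")); // { sale: '$67.97', retail: '$90' }
console.log(parsePriceText(""));            // { sale: null, retail: null }

// For comparison, what the current split logic yields on empty text:
const splitEl = "".split("\n");
console.log(splitEl[0], splitEl[1]);        // logs an empty string and undefined
```

This does not fix the missing prices themselves; it only makes an empty price element show up as an explicit `null` so those entries are easy to spot before they are written to Firestore.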