Question

I am trying to scrape the Nike store using bs4 and requests_html. This is my code:

from requests_html import HTMLSession
from bs4 import BeautifulSoup as sp
import pandas as pd

s = HTMLSession()

url = 'https://www.nike.com/es/w/hombre-zapatillas-nik1zy7ok'


def getData(url):
    r = s.get(url)
    # Render the page in headless Chromium: press Page Down 10 times,
    # sleeping 4 seconds after each press.
    r.html.render(sleep=4, scrolldown=10)
    soup = sp(r.html.html, 'html.parser')
    return soup


def getDeals(soup):
    lista = []
    products = soup.find_all('div', {'class': 'product-card__body'})
    for product in products:
        # The link overlay carries both the product name and the href,
        # so look it up once and fall back to placeholders if it is missing.
        overlay = product.find('a', {'class': 'product-card__link-overlay'})
        name = overlay.text.strip() if overlay else 'Error de titulo'
        link = overlay.get('href', 'No link') if overlay else 'No link'

        price_tag = product.find('div', {'class': 'product-card__price-wrapper'})
        price = price_tag.text.strip().replace('€', '') if price_tag else 'No price'

        # Append the row directly instead of binding it to a variable named
        # `dict`, which would shadow the built-in.
        lista.append({'name': name, 'price': price, 'link': link})

    df = pd.DataFrame(lista)
    df.to_csv('ZapatillasNike.csv', index=False)


getDeals(getData(url))

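The problem appears when I use the r.html.render(scrolldown=x) method. The first page of this site shows 24 results, and scrolling down should load another 24. I have tried scrolling the page with several different values (1, 5, 10, 100), but I always get 24 results. Any help is appreciated. Thanks!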
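To make the symptom easier to reproduce, here is a minimal diagnostic sketch that counts the product cards for several scrolldown values. It is not a fix. The helper names (CardCounter, count_cards, compare_scroll_counts, render_with_requests_html) are hypothetical and only for illustration; the product-card__body class is taken from the code above, and the counting uses the standard-library HTML parser so it can be checked without a browser.

```python
from html.parser import HTMLParser


class CardCounter(HTMLParser):
    """Counts <div> elements that carry a given CSS class."""

    def __init__(self, css_class="product-card__body"):
        super().__init__()
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # The class attribute may hold several space-separated tokens.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and self.css_class in classes:
            self.count += 1


def count_cards(html):
    """Return how many product cards appear in an HTML string."""
    counter = CardCounter()
    counter.feed(html)
    return counter.count


def compare_scroll_counts(render, url, scroll_values):
    """Call render(url, scrolldown) for each value and map it to the card count."""
    return {n: count_cards(render(url, n)) for n in scroll_values}


def render_with_requests_html(url, scrolldown):
    """Hypothetical wrapper around the render call used in the question."""
    # Imported lazily so the counting helpers work without requests_html.
    from requests_html import HTMLSession

    session = HTMLSession()
    r = session.get(url)
    r.html.render(sleep=4, scrolldown=scrolldown)
    return r.html.html
```

Calling compare_scroll_counts(render_with_requests_html, url, [1, 5, 10]) would show whether the count changes at all with the number of scrolls. If every value gives the same count, the extra products are probably not being added to the rendered DOM by the Page Down key presses, which narrows down where to look next.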

Scraping a scrolling page with requests_html

0 answers: