Question

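How can I scrape e-commerce websites (paytm.com, jabong.com) that load their content dynamically, i.e. load more products as you scroll down?

Thanks in advance.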

Answer 1

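As you said, Paytm is a dynamic website. It sends a JSON request for each chunk of transactions. By default, each request returns 10 transactions.

To scrape it, you need to make a JSON POST request asking for a larger number of transactions. In response you get JSON data, which is easy to parse. I just checked, and this logic held at the time of writing.

Sending a JSON request with Python Scrapy: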

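import json
import scrapy

# Placeholder: replace with the JSON endpoint the site actually calls
# (visible in the browser's network tab); the original answer does not give it.
url = 'https://example.com/endpoint'
my_data = {'field1': 'value1', 'field2': 'value2'}
# Send the payload as a JSON body rather than as form data
request = scrapy.Request(url, method='POST',
                         body=json.dumps(my_data),
                         headers={'Content-Type': 'application/json'})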
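
To go from a single request to "load more", one option is to repeat the request with an increasing page number until an empty chunk comes back. The following is only a minimal sketch of that loop: the field names `page` and `items_per_page` and the response key `items` are assumptions for illustration, not Paytm's actual API, and `fetch` stands in for whatever sends the request (for example one Scrapy `Request` per page).

```python
import json


def build_payload(page, page_size=10):
    """Build the JSON body for one chunk. Field names are hypothetical."""
    return json.dumps({"page": page, "items_per_page": page_size})


def collect_all(fetch, page_size=10, max_pages=100):
    """Request chunk after chunk until one comes back empty.

    `fetch` is any callable that takes a JSON body (str) and returns
    the response body (str).
    """
    items = []
    for page in range(1, max_pages + 1):
        data = json.loads(fetch(build_payload(page, page_size)))
        chunk = data.get("items", [])  # "items" is an assumed response key
        if not chunk:
            break  # an empty chunk means there is nothing left to load
        items.extend(chunk)
    return items
```

The `max_pages` cap is there so that a server which never returns an empty chunk cannot keep the loop running forever.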

I am not going to write any further code, because without a legitimate reason this could amount to a privacy violation.

Answer 2

Here is an example for Paytm Mall, using Selenium.

ptm_url ='https://paytmmall.com/lenovo-ideapad-s145-8th-gen-intel-core-i5-15-6-inch-fhd-thin-and-light-laptop-8gb-1tb-dos-nvidia-mx-110-2gb-graphics-textured-black-1-86kg-81mv013sin-CMPLXLAPLENOVO-IDEAPGREE3950873DACD018-pdp?product_id=311698389&sid=f46b33ff-7ea6-4575-b6ea-cdb77e33fcf3&src=consumer_search&svc=-1&cid=6453&tracker=organic%7C%7Clenovo%7Cgrid%7CSearch_experimentName%3Dnew_ranking%7C%7C1%7Cnew_ranking&get_review_id=311698391&site_id=2&child_site_id=6'

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

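from selenium.webdriver.chrome.service import Service

driver_path = 'C:/Users/["Your Name"]/Downloads/chromedriver.exe'
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument('disable-infobars')
# Selenium 4 takes the driver path via a Service object and the options via `options=`
driver = webdriver.Chrome(service=Service(driver_path), options=options)
driver.get(ptm_url)

# These class names come from the page markup at the time of writing and may change
title = driver.find_elements(By.CLASS_NAME, 'NZJI')
price = driver.find_elements(By.CLASS_NAME, '_1V3w')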
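
For pages that load more products as you scroll, a common approach is to scroll to the bottom repeatedly until the page height stops growing. The sketch below assumes a Selenium-style `driver` that exposes `execute_script`; the function name `scroll_until_loaded`, the pause length, and the round limit are illustrative choices, not part of the original answer.

```python
import time


def scroll_until_loaded(driver, pause=2.0, max_rounds=50):
    """Scroll to the bottom repeatedly until the page height stops growing.

    Returns the number of scrolls that caused new content to load.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    loaded = 0
    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to fetch and render the next chunk
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # height unchanged, so no further content was loaded
        last_height = new_height
        loaded += 1
    return loaded
```

After calling `scroll_until_loaded(driver)`, repeating the element lookups above should also return the newly loaded products, and their text can be read with something like `[t.text for t in title]`.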

How to dynamically load e-commerce websites (such as paytm.com) in Python?
