import scrapy
from scrapy.http import TextResponse
from selenium import webdriver

class Spider1(scrapy.Spider):
    name = "len"
    allowed_domains = ["support.lenovo.com"]
    start_urls = ["https://support.lenovo.com/in/hi/contactus1/findaprovider/service-provider-list?countrycode=in"]

    def parse(self, response):
        driver = webdriver.Firefox()
        driver.get(self.start_urls[0])  # get() takes a single URL string, not the list
        # Wrap the rendered page source so Scrapy selectors can run on it
        response = TextResponse(url=driver.current_url,
                                body=driver.page_source,
                                encoding='utf-8')
        url = driver.current_url
        elem = driver.find_element_by_class_name("page-next")
        elem.click()
        self.fun(url)

    def fun(self, url):  # function to extract data on each page
        pass
I am trying to extract the details from all 50 pages. I have the code to extract one page of data using Scrapy, but I want to extract all 50 pages. I know this can be done using Selenium. Can anyone tell me the logic, or reply with some examples, so that I can understand how to extract the data from all the pages?
Here is the link: https://support.lenovo.com/in/hi/contactus1/findaprovider/service-provider-list?countrycode=in
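For what the question asks, the usual shape is a single loop that clicks "next" and re-wraps the rendered page each time. A minimal sketch, assuming the "page-next" class name from the code above, a hypothetical div.provider selector, and the 50-page count stated in the question:

import time

import scrapy
from scrapy.http import TextResponse
from selenium import webdriver

class PaginatedSpider(scrapy.Spider):
    name = "len_all_pages"
    allowed_domains = ["support.lenovo.com"]
    start_urls = ["https://support.lenovo.com/in/hi/contactus1/findaprovider/service-provider-list?countrycode=in"]

    def parse(self, response):
        driver = webdriver.Firefox()
        driver.get(self.start_urls[0])
        for _ in range(50):  # 50 pages, as stated in the question
            # Wrap the rendered HTML so Scrapy selectors work on it
            page = TextResponse(url=driver.current_url,
                                body=driver.page_source,
                                encoding='utf-8')
            for row in page.css("div.provider"):  # hypothetical selector
                yield {"provider": row.get()}
            # Click through to the next page; stop early if the button is gone
            try:
                driver.find_element_by_class_name("page-next").click()
            except Exception:
                break
            time.sleep(2)  # crude wait for the next page to render
        driver.quit()

Clicking inside the loop keeps one browser session alive across all pages, which is why the TextResponse is rebuilt on every iteration.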
Answer 0: (Score: 1)
You don't need to use Selenium; you can access the data through this link without needing JavaScript:
Just increment the page number:
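The link itself did not survive in this copy of the answer, so the URL below is only a guess at its shape: the page query parameter is hypothetical. It does show the answer's point, though: plain Scrapy requests, one per page number, with no browser involved.

import scrapy

class ProviderSpider(scrapy.Spider):
    name = "len_by_page"
    # Hypothetical template: the real endpoint from the answer is missing here,
    # so only the idea of an incrementing page parameter is shown.
    url_template = ("https://support.lenovo.com/in/hi/contactus1/findaprovider/"
                    "service-provider-list?countrycode=in&page={}")

    def start_requests(self):
        for page in range(1, 51):  # pages 1..50, per the question
            yield scrapy.Request(self.url_template.format(page))

    def parse(self, response):
        for row in response.css("div.provider"):  # hypothetical selector
            yield {"provider": row.get()}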