How to extract data from all "next" pages using Selenium WebDriver with Scrapy in Python

Time: 2015-08-18 11:42:59

Tags: python selenium webdriver scrapy

import scrapy
from scrapy.http import TextResponse
from selenium import webdriver

class Spider1(scrapy.Spider):
    name = "len"
    allowed_domains = ["support.lenovo.com"]
    start_urls = ["https://support.lenovo.com/in/hi/contactus1/findaprovider/service-provider-list?countrycode=in"]

    def parse(self, response):
        # Load the page in a real browser so the JavaScript-rendered
        # content and pagination controls are available.
        driver = webdriver.Firefox()
        driver.get(self.start_urls[0])
        html = driver.page_source.encode('utf-8')
        # Wrap the rendered HTML in a TextResponse so Scrapy selectors work on it.
        response = TextResponse(url=driver.current_url, body=html, encoding='utf-8')
        url = driver.current_url

        # Click the "next page" link, then process the new page.
        elem = driver.find_element_by_class_name("page-next")
        elem.click()
        self.fun(url)

    def fun(self, url):  # function to extract data on each page
        pass

I am trying to extract the details from all 50 pages. My code extracts one page of data using Scrapy, but I want to extract all 50 pages. I know this can be done using Selenium. Can anyone explain the logic or show me an example so I can understand how to extract the data from all the pages?

Here is the link: https://support.lenovo.com/in/hi/contactus1/findaprovider/service-provider-list?countrycode=in

1 Answer:

Answer 0 (score: 1)
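The answer body was not preserved in this scrape. As a minimal sketch of the usual approach, you can loop on the page: extract the current page's data, then look for the "next" element and click it, stopping when it no longer exists. The sketch below uses `find_elements_by_class_name` (the Selenium 2.x-era API this post uses; Selenium 4 replaces it with `driver.find_elements(By.CLASS_NAME, ...)`), and the plural `find_elements` is chosen deliberately because it returns an empty list instead of raising on the last page. The `FakeDriver` class is a stand-in so the loop can be demonstrated without a browser; a real `webdriver.Firefox()` exposes the same `page_source` and `find_elements_by_class_name` members used here.

```python
def scrape_all_pages(driver, extract):
    """Run extract() on every page, clicking "next" until it disappears."""
    results = []
    while True:
        results.extend(extract(driver.page_source))
        # find_elements (plural) returns [] instead of raising when the
        # "next" button is absent, i.e. on the last page.
        next_links = driver.find_elements_by_class_name("page-next")
        if not next_links:
            break
        next_links[0].click()
    return results


class FakeDriver:
    """Stand-in for webdriver.Firefox() used only to demonstrate the loop."""
    def __init__(self, pages):
        self.pages = pages
        self.index = 0

    @property
    def page_source(self):
        return self.pages[self.index]

    def find_elements_by_class_name(self, name):
        # Reuse self as the clickable "next" element while pages remain.
        return [self] if self.index + 1 < len(self.pages) else []

    def click(self):
        self.index += 1


driver = FakeDriver(["page 1 html", "page 2 html", "page 3 html"])
print(scrape_all_pages(driver, lambda html: [html]))
# → ['page 1 html', 'page 2 html', 'page 3 html']
```

With a real driver, `extract` would parse `page_source` with Scrapy selectors (e.g. via `TextResponse`) instead of returning the raw HTML, and you may need an explicit wait after each click so the next page finishes loading before it is scraped.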