First Scraper> INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)

Time: 2019-10-28 02:30:54

Tags: scrapy

When I run 'scrapy shell "URL"', my XPath expressions return results, but when I run 'scrapy crawl bot', I get nothing back.

I have already tried .extract(), .get(), and .getall().

import scrapy


class AcicbotSpider(scrapy.Spider):
    name = 'acicbot'
    allowed_domains = ['www.acichapeco.com.br']
    start_urls = ['https://www.acichapeco.com.br/associados/_busca_/?sc3=&c1&pg=0']

    def parse(self, response):
        #Extracting the content using xpath selectors
        nomes = response.xpath('//*[@id="p1344"]/a[2]/label/text()').extract()
        emails = response.xpath('//*[@id="tela"]/div/div[5]/a/text()').extract()
        sites = response.xpath('//*[@id="tela"]/div/div[6]/a/text()').extract()

        #Combine the extracted lists row by row
        for nome, email, site in zip(nomes, emails, sites):
            #create a dictionary to store the scraped info
            scraped_info = {
                'nome': nome,
                'email': email,
                'site': site,
            }

            #yield or give the scraped info to scrapy
            yield scraped_info
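
One detail of the loop above that may matter for the "scraped 0 items" log line: zip() stops at the shortest of its inputs, so if any one XPath matches nothing, the loop body never runs and nothing is yielded. Whether that is what happens on this site is not something the post confirms. The sketch below uses made-up values and a hypothetical helper, build_rows, to show the behavior and one possible alternative with itertools.zip_longest:

```python
from itertools import zip_longest


def build_rows(nomes, emails, sites):
    """Pair the three lists by position, padding shorter lists with None.

    build_rows is a hypothetical helper for illustration; it is not part
    of Scrapy or of the original spider.
    """
    return [
        {'nome': nome, 'email': email, 'site': site}
        for nome, email, site in zip_longest(nomes, emails, sites)
    ]


# Made-up values, only to show the difference from plain zip().
nomes = []  # e.g. an XPath that matched nothing
emails = ['a@example.com']
sites = ['www.example.com']

# zip() produces no tuples because one input is empty.
print(list(zip(nomes, emails, sites)))

# zip_longest() keeps the partial record and fills the gap with None.
print(build_rows(nomes, emails, sites))
```

Seeing None in a field this way at least shows which selector came back empty, instead of silently producing no items.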

I need to extract the company name, email, and site, and I also need the bot to crawl the next page: https://www.acichapeco.com.br/associados/busca/?sc3=&c1&pg=2
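
For the next-page part, one possible approach is to derive the following URL from the current one by bumping the pg query parameter. This is only a sketch: next_page_url is a hypothetical helper, and the step size is an assumption, since the post shows pg=0 and pg=2 but does not say how the site numbers its pages.

```python
import re

# Matches the digits of a "pg" query parameter, e.g. the 0 in "...&pg=0".
_PG_PATTERN = re.compile(r'(?<=[?&]pg=)\d+')


def next_page_url(url, step=1):
    """Return url with its pg parameter increased by step.

    Returns None when the URL has no numeric pg parameter.
    The rest of the URL is left untouched.
    """
    match = _PG_PATTERN.search(url)
    if match is None:
        return None
    next_page = int(match.group()) + step
    return url[:match.start()] + str(next_page) + url[match.end():]


start = 'https://www.acichapeco.com.br/associados/_busca_/?sc3=&c1&pg=0'
print(next_page_url(start))
```

Inside the spider's parse method, the result could then be passed to Scrapy's response.follow(url, callback=self.parse) and yielded, with some stop condition (for example, when a page produces no items) so the crawl does not run forever.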

0 answers:

No answers yet