When I run `scrapy shell "URL"`, my XPath expressions return results, but when I run `scrapy crawl bot`, I get nothing back.
I have already tried .extract(), .get(), and .getall().
import scrapy


class AcicbotSpider(scrapy.Spider):
    name = 'acicbot'
    allowed_domains = ['www.acichapeco.com.br']
    start_urls = ['https://www.acichapeco.com.br/associados/_busca_/?sc3=&c1&pg=0']

    def parse(self, response):
        # Extract the content using XPath selectors
        nomes = response.xpath('//*[@id="p1344"]/a[2]/label/text()').extract()
        emails = response.xpath('//*[@id="tela"]/div/div[5]/a/text()').extract()
        sites = response.xpath('//*[@id="tela"]/div/div[6]/a/text()').extract()

        # Walk the extracted content row by row.
        # All three lists are zipped so that item[2] exists.
        for item in zip(nomes, emails, sites):
            # Create a dictionary to store the scraped info
            scraped_info = {
                'nome': item[0],
                'email': item[1],
                'site': item[2],
            }
            # Yield the scraped info to Scrapy
            yield scraped_info
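
One thing that may be worth checking here: `zip` stops at its shortest input, so if any one of the selectors returns an empty list during the crawl, the loop body never runs and nothing is yielded. A minimal sketch of that behaviour, using made-up sample values rather than real data from the site:

```python
# Made-up sample values for illustration; not real data from the site.
nomes = []  # pretend the first selector matched nothing
emails = ["a@example.com", "b@example.com"]
sites = ["example.com", "example.org"]

# zip truncates to the shortest input, so an empty list empties the result.
rows = list(zip(nomes, emails, sites))
print(len(rows))
```

Logging the length of each list inside `parse` would show whether this is what happens in the crawl.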
I need to extract the company name, email, and site, and I also need the bot to crawl the next page: https://www.acichapeco.com.br/associados/busca/?sc3=&c1&pg=2
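
For the next-page part, a possible approach is to build the following URL by increasing the `pg` query parameter. The sketch below is only an assumption about how the site paginates: `next_page_url` is a hypothetical helper, not something from Scrapy, and the `step` value is a guess, since the URLs above show `pg=0` and `pg=2` and I do not know the real increment.

```python
import re

# Hypothetical helper, not part of Scrapy or of the spider above.
# Matches the numeric "pg" query parameter, e.g. "?pg=0" or "&pg=0".
_PG_PATTERN = re.compile(r"([?&]pg=)(\d+)")


def next_page_url(url, step=1):
    """Return url with its pg parameter increased by step.

    Returns None when the URL has no numeric pg parameter.
    The step size is an assumption; adjust it to match the site.
    """
    match = _PG_PATTERN.search(url)
    if match is None:
        return None
    next_pg = int(match.group(2)) + step
    # Replace only the digits, leaving the rest of the URL untouched.
    return url[:match.start(2)] + str(next_pg) + url[match.end(2):]


print(next_page_url("https://www.acichapeco.com.br/associados/_busca_/?sc3=&c1&pg=0"))
```

Inside `parse`, the result could then be passed to `scrapy.Request(next_url, callback=self.parse)` after the items are yielded, ideally only when the current page produced at least one item so that the crawl stops at the last page.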