Spider not found, key problem

时间:2019-01-03 15:25:31

标签: python-3.x web-scraping scrapy

I already made a working project with Scrapy, but it was a mess, so I decided to build a new, more polished one.

This question has been answered several times before, but none of the solutions really helped me. The error is so basic that I'm a bit frustrated.

When I try to run my spider with scrapy crawl generic_spider, I get this error message:


KeyError: 'Spider not found: generic_spider'

Here are the traceback, my generic_spider, and my settings.

Traceback (most recent call last):
  File "C:\Users\Manuel\Anaconda3\Scripts\scrapy-script.py", line 10, in <module>
    sys.exit(execute())
  File "C:\Users\Manuel\Anaconda3\lib\site-packages\scrapy\cmdline.py", line 150, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "C:\Users\Manuel\Anaconda3\lib\site-packages\scrapy\cmdline.py", line 90, in _run_print_help
    func(*a, **kw)
  File "C:\Users\Manuel\Anaconda3\lib\site-packages\scrapy\cmdline.py", line 157, in _run_command
    cmd.run(args, opts)
  File "C:\Users\Manuel\Anaconda3\lib\site-packages\scrapy\commands\crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "C:\Users\Manuel\Anaconda3\lib\site-packages\scrapy\crawler.py", line 170, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "C:\Users\Manuel\Anaconda3\lib\site-packages\scrapy\crawler.py", line 198, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "C:\Users\Manuel\Anaconda3\lib\site-packages\scrapy\crawler.py", line 202, in _create_crawler
    spidercls = self.spider_loader.load(spidercls)
  File "C:\Users\Manuel\Anaconda3\lib\site-packages\scrapy\spiderloader.py", line 71, in load
    raise KeyError("Spider not found: {}".format(spider_name))
KeyError: 'Spider not found: generic_spider'
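A quick way to check what the loader can actually see is Scrapy's built-in list command, run from the directory that contains scrapy.cfg; it prints the name of every spider Scrapy can find:

scrapy list

If generic_spider does not show up in that output, the spider loader cannot see the spider module, which is exactly what the KeyError reports.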

generic_spider.py

import scrapy
import re
from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor
from genericScraper.items import GenericScraperItem
from scrapy.exceptions import CloseSpider
from scrapy.http import Request

class GenericScraperSpider(CrawlSpider):

    name = "generic_spider"
    # Things

    def start_requests(self):
        # More things
        pass

    def parse_item(self, response):
        pass

EDIT:

tree output (I don't know why only the pycache shows up; EDIT 2: it seems tree only displays folders):

C:.

settings.py

# -*- coding: utf-8 -*-

# Scrapy settings for genericScraper project
#
# For simplicity, this file contains only settings considered important or
# commonly used. You can find more settings consulting the documentation:
#
#     https://doc.scrapy.org/en/latest/topics/settings.html
#     https://doc.scrapy.org/en/latest/topics/downloader-middleware.html
#     https://doc.scrapy.org/en/latest/topics/spider-middleware.html

BOT_NAME = 'genericScraper'

SPIDER_MODULES = ['genericScraper.spiders']
NEWSPIDER_MODULE = 'genericScraper.spiders'

scrapy.cfg

[settings]
default = genericScraper.settings

[deploy]
project = genericScraper

3 Answers:

Answer 0 (score: 1):

Usually, when you run into this problem, you have to make sure of 3 things (see the sketch after this list):

  1. You are in the project root directory (the one where scrapy.cfg is)
  2. You have the correct project structure, with your spider at project/spiders/spider.py
  3. Your spider is a valid class with a name attribute
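To make point 3 concrete, here is a minimal sketch of a spider class that the loader can discover; the start URL is a placeholder, not from the original post:

import scrapy

class GenericScraperSpider(scrapy.Spider):
    # "scrapy crawl generic_spider" looks the spider up by this attribute;
    # without a name, the spider loader cannot register the class.
    name = "generic_spider"

    # Placeholder URL so the sketch is self-contained and runnable.
    start_urls = ["https://example.com"]

    def parse(self, response):
        # Yield something trivial to prove the spider runs.
        yield {"url": response.url}

Save this under the package listed in SPIDER_MODULES (here genericScraper/spiders/) and run scrapy crawl generic_spider from the folder containing scrapy.cfg.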

Answer 1 (score: 0):

The spider file may not be in exactly the right place. It should be inside the spiders folder.
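For reference, a layout that satisfies this, using the names from the settings above (the spider's file name itself is arbitrary), would look roughly like:

genericScraper/
    scrapy.cfg
    genericScraper/
        __init__.py
        items.py
        settings.py
        spiders/
            __init__.py
            generic_spider.py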

Answer 2 (score: 0):

I found the problem!

Remove

LOG_STDOUT = True

I had LOG_STDOUT = True in settings.py, and for some reason it kept the scraper from seeing the spiders. Removing that line from the settings solved the problem!
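If you still want logging to go to a file without redirecting stdout, a sketch using Scrapy's documented LOG_FILE setting (the file name here is arbitrary) could replace that line in settings.py:

BOT_NAME = 'genericScraper'

SPIDER_MODULES = ['genericScraper.spiders']
NEWSPIDER_MODULE = 'genericScraper.spiders'

# Instead of LOG_STDOUT = True, send Scrapy's own log to a file;
# this avoids redirecting the process's stdout.
LOG_FILE = 'scrapy.log'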