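Python Shell does not run Scrapy

Time: 2014-07-06 00:50:04

Tags: python scrapy

I am running the Python.org version 2.7 64-bit on Windows Vista 64-bit to use Scrapy. I have some code that works when I run it through the command shell (apart from some issues with the command shell not recognizing non-Unicode characters), but when I try to run the script through Python IDLE I get the following error message: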

Warning (from warnings module):
  File "C:\Python27\mrscrap\mrscrap\spiders\test.py", line 24
    class MySpider(BaseSpider):
ScrapyDeprecationWarning: __main__.MySpider inherits from deprecated class scrapy.spider.BaseSpider, please inherit from scrapy.spider.Spider. (warning only on first subclass, there may be others)

The code used to generate this error is:

from scrapy.spider import BaseSpider
from scrapy.selector import Selector
from scrapy.utils.markup import remove_tags
import re

class MySpider(BaseSpider):
    name = "wiki"
    allowed_domains = ["wikipedia.org"]
    start_urls = ["http://en.wikipedia.org/wiki/Asia"]

    def parse(self, response):
        titles = response.selector.xpath("normalize-space(//title)")
        for titles in titles:

            body = response.xpath("//p").extract()
            body2 = "".join(body)
            print remove_tags(body2)

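Firstly, what is the cause of this error when the code works fine in the command shell? Secondly, when I follow the instruction in the error and replace both instances of BaseSpider in the code with Spider, the code runs in the Python shell but does nothing. No errors, nothing printed to the log, no warnings, nothing.

Can anyone tell me why this revised version of the code does not print its output to Python IDLE?

Thanks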

1 answer:

Answer 0: (score: 1)

Add from scrapy.cmdline import execute to your imports.

Then put execute(['scrapy','crawl','wiki']) at the end of the script and run it.

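from scrapy.spider import Spider
from scrapy.selector import Selector
from scrapy.utils.markup import remove_tags
from scrapy.cmdline import execute
import re


class MySpider(Spider):
    name = "wiki"
    allowed_domains = ["wikipedia.org"]
    start_urls = ["http://en.wikipedia.org/wiki/Asia"]

    def parse(self, response):
        # Select the whitespace-normalized page title.
        titles = response.selector.xpath("normalize-space(//title)")
        for title in titles:
            # Join every <p> element and print its text with the tags stripped.
            body = response.xpath("//p").extract()
            body2 = "".join(body)
            print(remove_tags(body2))


# Start the crawl only when this file is run directly (for example from IDLE),
# not when Scrapy imports it while loading the project's spiders.
if __name__ == "__main__":
    execute(['scrapy', 'crawl', 'wiki'])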
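
As a rough illustration of why the Spider version ran silently in IDLE: running a file that only defines a class produces no output, because nothing ever calls parse. Something has to drive the spider, which is what the execute call does. The sketch below uses plain Python only. FakeSpider and run_crawl are hypothetical names made up for this example, not part of Scrapy, and the real engine does far more (scheduling, downloading, logging).

```python
class FakeSpider(object):
    """Hypothetical stand-in for a spider class; not Scrapy's Spider."""
    name = "wiki"

    def parse(self, response):
        # Return the result so the caller decides what to do with it.
        return "parsed: " + response


def run_crawl(spider_cls, response):
    """Hypothetical runner: create the spider and hand it one response."""
    spider = spider_cls()
    return spider.parse(response)


# Up to this point nothing has produced any output: the class and the
# function are only defined. Output appears once something runs the spider.
if __name__ == "__main__":
    print(run_crawl(FakeSpider, "<title>Asia</title>"))
```

Without the last two lines the script finishes without printing anything, which matches the behaviour described in the question.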