Question

I want to scrape a job site, and I want to run some tests in the scrapy shell first.

So if I type this:

scrapy shell http://www.seek.com.au

and then type

from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor

it works fine.

But if I do this:

scrapy shell http://www.seek.com.au/JobSearch?DateRange=31&SearchFrom=quick&Keywords=python&nation=3000

and then type

from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor

bash complains that from is not a valid command, and scrapy drops out to the terminal as a stopped job:

>>> from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
-bash: from: command not found

[5]+  Stopped                 scrapy shell http://www.seek.com.au/JobSearch?DateRange=31
[7]   Done                    Keywords=php

Answer 1

Apparently you need to wrap your URL in double quotes:

scrapy shell "http://www.seek.com.au/JobSearch?DateRange=31&SearchFrom=quick&Keywords=python&nation=3000"
>>> from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
>>> lx = SgmlLinkExtractor()

Then everything works (the lines above are my actual shell output).

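I tried it without the double quotes and it does not work: the fetch keeps running, and the first keypress drops me back to bash without any visible change on screen, which gives me the same error you got. The cause is that bash treats each unquoted & as the end of a command to be run in the background, so scrapy only receives the URL up to DateRange=31, which is exactly what your job listing shows.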
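If the command is being built from a script rather than typed by hand, the quoting can be left to Python's standard shlex module. The following is only a minimal sketch: the split on & is a rough imitation of how bash cuts the unquoted command, not a real shell parser, and the variable names are made up for illustration.

```python
import shlex

url = ("http://www.seek.com.au/JobSearch"
       "?DateRange=31&SearchFrom=quick&Keywords=python&nation=3000")

# Rough imitation of bash: an unquoted & ends the command, so only the
# text before the first & reaches scrapy as its URL argument.
seen_by_scrapy = url.split("&")[0]
print(seen_by_scrapy)

# shlex.quote wraps the URL in single quotes so the shell passes it whole.
command = "scrapy shell " + shlex.quote(url)
print(command)

# Splitting the quoted command with POSIX shell rules returns the URL intact.
assert shlex.split(command) == ["scrapy", "shell", url]
```

shlex.quote uses single quotes where the answer above uses double quotes; for this URL either form keeps bash from interpreting the & characters.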

How can I use scrapy shell with a URL that has query parameters?
