I'm new to Python and Scrapy, and I'm working through the Scrapy tutorial. I was able to create my project from the DOS prompt by typing:
scrapy startproject dmoz
The tutorial later refers to the crawl command:
scrapy crawl dmoz.org
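But every time I try to run it, I get a message saying that this is not a valid command. Looking around further, it seems I need to be inside a project, and that is the part I can't figure out. I've tried changing directories into the "dmoz" folder that startproject created, but from there Scrapy isn't recognized at all.
I'm sure I'm missing something obvious, and I'm hoping someone can point it out.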
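Answer 0 (score: 8)
You have to run it from inside the folder that 'startproject' created. When Scrapy finds a scrapy.cfg file, additional commands (including crawl) become available. You can see the difference here: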
$ scrapy startproject bar
$ cd bar/
$ ls
bar scrapy.cfg
$ scrapy
Scrapy 0.12.0.2536 - project: bar

Usage:
  scrapy <command> [options] [args]

Available commands:
  crawl         Start crawling from a spider or URL
  deploy        Deploy project in Scrapyd target
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  list          List available spiders
  parse         Parse URL (using its spider) and print the results
  queue         Deprecated command. See Scrapyd documentation.
  runserver     Deprecated command. Use 'server' command instead
  runspider     Run a self-contained spider (without creating a project)
  server        Start Scrapyd server for this project
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Use "scrapy <command> -h" to see more info about a command
$ cd ..
$ scrapy
Scrapy 0.12.0.2536 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Use "scrapy <command> -h" to see more info about a command
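If you are unsure whether your current directory counts as "inside a project", the check can be approximated with a few lines of Python. This is only a sketch: it assumes Scrapy looks for scrapy.cfg in the current directory and then in each parent directory, and find_scrapy_cfg is a hypothetical helper written for illustration, not part of Scrapy's API.

```python
import os


def find_scrapy_cfg(start_dir="."):
    """Return the path of the nearest scrapy.cfg at or above start_dir, or None.

    Hypothetical helper for illustration; it only mimics the assumed lookup.
    """
    current = os.path.abspath(start_dir)
    while True:
        candidate = os.path.join(current, "scrapy.cfg")
        if os.path.isfile(candidate):
            return candidate
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the filesystem root without finding a config file.
            return None
        current = parent


if __name__ == "__main__":
    cfg = find_scrapy_cfg()
    if cfg:
        print("Project found, scrapy.cfg is at: %s" % cfg)
    else:
        print("No active project: cd into the folder created by startproject")
```

Run from the project folder (bar in the session above), this should report the scrapy.cfg path; run from one level up, it should report no active project, which matches the shorter command list.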
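Answer 1 (score: 2)
The PATH environment variable isn't set.
You can set the PATH environment variable for both Python and Scrapy by opening System Properties (My Computer > Properties > Advanced System Settings), going to the Advanced tab, and clicking the Environment Variables button. In the new window, scroll to the Path variable under System Variables and add the following entries, separated by semicolons: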
C:\{path to python folder};C:\{path to python folder}\Scripts
For example:
C:\Python27;C:\Python27\Scripts
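To confirm the change took effect, something like the following Python sketch can compare the expected folders against PATH. The helper name dirs_missing_from_path is hypothetical, and the C:\Python27 paths are just the example values above; substitute your own install location.

```python
import os


def dirs_missing_from_path(required_dirs, path_value=None, sep=os.pathsep):
    """Return the entries of required_dirs that are not listed in path_value.

    Hypothetical helper for illustration. path_value defaults to the current
    PATH; sep defaults to the platform separator (";" on Windows).
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def norm(p):
        # Ignore surrounding spaces, trailing slashes, and (on Windows) case.
        return os.path.normcase(os.path.normpath(p.strip()))

    entries = set(norm(p) for p in path_value.split(sep) if p.strip())
    return [d for d in required_dirs if norm(d) not in entries]


if __name__ == "__main__":
    # Example values from the answer above; adjust to your own install.
    wanted = [r"C:\Python27", r"C:\Python27\Scripts"]
    missing = dirs_missing_from_path(wanted)
    if missing:
        print("Not on PATH: %s" % ", ".join(missing))
    else:
        print("Both folders are on PATH")
```

Environment variable changes only apply to newly started programs, so open a new command prompt before running the check or retrying the scrapy command.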