Question

I have a Scrapy spider at this path:

define("SPIDER_PATH", "C:\\Users\\[USERNAME]\\test1\\test1\\spiders\\test.py");

Now I am trying to start the script from PHP:

if (is_numeric(filter_input(INPUT_POST, "reload"))) {
    $additional = " -a check=" . filter_input(INPUT_POST, "reload");

}
echo shell_exec("scrapy runspider " . SPIDER_PATH . $additional);

But nothing happens, and shell_exec returns nothing.

I have tested this on my local machine using WAMP.

Can anyone help me?

The environment variables are set correctly (at least, I can run exactly the same command from Windows cmd.exe).

Answer 1

You can't run Scrapy from PHP the way you are doing it.

What you need is scrapyd.

https://scrapyd.readthedocs.org/en/latest/install.html

After installing it, go to your Scrapy project directory: C:\Users\[USERNAME]\test1\

Create or edit the scrapy.cfg file so that it contains:

[settings]
default = crawler.settings

[deploy]
url = http://localhost:6800/
project = crawler

Run the command

scrapyd-deploy -l

which lists your available targets:

default              http://localhost:6800/

Now you need to deploy the project:

scrapyd-deploy default -p test1

More about deploying a project: https://scrapyd.readthedocs.org/en/latest/deploy.html

Once the project is deployed, you can schedule the spider with a curl request:

curl http://localhost:6800/schedule.json -d project=test1 -d spider=test

More about the scrapyd API: https://scrapyd.readthedocs.org/en/latest/api.html
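
Since the question is about triggering the spider from PHP, the same schedule.json request can also be sent from PHP rather than from the command line. The following is only a rough sketch, assuming scrapyd is listening on http://localhost:6800, that allow_url_fopen is enabled, and that the project and spider are named test1 and test as in the curl example. The helper names build_schedule_fields and schedule_spider are made up for illustration. scrapyd passes any extra POST field on to the spider as an argument, which is how check would replace the -a check=... option.

```php
<?php
// Hypothetical helper: builds the POST fields for scrapyd's schedule.json.
// Fields other than project/spider are passed on to the spider as arguments.
function build_schedule_fields(string $project, string $spider, array $spiderArgs = []): array
{
    return array_merge($spiderArgs, ['project' => $project, 'spider' => $spider]);
}

// Hypothetical helper: POSTs the fields and returns the raw JSON reply,
// or null if the request failed (for example, scrapyd is not running).
function schedule_spider(string $baseUrl, array $fields): ?string
{
    $context = stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => 'Content-Type: application/x-www-form-urlencoded',
            'content' => http_build_query($fields),
            'timeout' => 5,
        ],
    ]);
    $response = @file_get_contents(rtrim($baseUrl, '/') . '/schedule.json', false, $context);
    return $response === false ? null : $response;
}

// Forward the numeric "reload" POST value as the spider argument "check".
$args = [];
$reload = filter_input(INPUT_POST, 'reload');
if (is_numeric($reload)) {
    $args['check'] = $reload;
}

$reply = schedule_spider('http://localhost:6800', build_schedule_fields('test1', 'test', $args));
echo $reply ?? 'Could not reach scrapyd';
```

The request only queues the job, so the JSON reply reports whether scheduling succeeded, not the spider's output.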

Answer 2

You need to call chdir() first.

// chdir() expects a directory, so switch to the folder that contains the spider
chdir("C:\\Users\\[USERNAME]\\test1\\test1\\spiders");
echo shell_exec("scrapy runspider test.py" . $additional);
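
If the call still returns nothing, it may help to see what the command actually reports. shell_exec() does not capture stderr, and it returns null both when the command fails and when it prints nothing. The sketch below is only a diagnostic aid, not a definitive fix. It assumes SPIDER_PATH is defined as in the question, and build_runspider_command is a hypothetical helper name. One possible cause (a guess, not confirmed by the question) is that the Apache service started by WAMP runs with a different PATH than your interactive cmd.exe session.

```php
<?php
// Hypothetical helper: builds the command line and quotes every argument.
function build_runspider_command(string $spiderFile, ?string $check = null): string
{
    $cmd = 'scrapy runspider ' . escapeshellarg($spiderFile);
    if ($check !== null && is_numeric($check)) {
        $cmd .= ' -a ' . escapeshellarg('check=' . $check);
    }
    // Redirect stderr into stdout so error messages are captured as well.
    return $cmd . ' 2>&1';
}

// Path taken from the question; [USERNAME] is the question's own placeholder.
if (!defined('SPIDER_PATH')) {
    define('SPIDER_PATH', "C:\\Users\\[USERNAME]\\test1\\test1\\spiders\\test.py");
}

$reload = filter_input(INPUT_POST, 'reload');
$cmd = build_runspider_command(SPIDER_PATH, is_string($reload) ? $reload : null);

// exec() collects the output lines and the exit code, unlike shell_exec().
exec($cmd, $outputLines, $exitCode);
echo '<pre>' . htmlspecialchars(implode("\n", $outputLines)) . '</pre>';
echo 'Exit code: ' . $exitCode;
```

A non-zero exit code together with a message such as the command not being recognized would point to a PATH problem in the web server's environment rather than to the spider itself.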

Scrapy script called via shell_exec does not execute