How do I send crawler data to PHP from the command line?

Time: 2016-03-23 15:01:26

Tags: web-crawler import.io

Can I send the crawl results to PHP instead of storing them in a JSON file?

I have these two files:

settings.json

{
"outputFile" : "C:\\wamp\\www\\drestip\\admin\\crawls\\mimshoes.json",
"logFile" : "C:\\wamp\\www\\drestip\\admin\\crawls\\mimshoes.tsv",
"pause" : 1,
"local" : false,
"connections" : 3,
"cookiesEnabled" : false,
"robotsDisabled" : false,
"advancedMode" : true,
"crawlTemplate" : [ "www.mimshoes.com/" ],
"startUrls" : [ PAGES ],
"maxDepth" : 10,
"dataTemplate" : [ "www.mimshoes.com/{alpha}-{alpha}_{alpha}-{alpha}$" ],
"destination" : "JSON",
"connectorGuid" : "xxxxxxxxxxxxxxxxxxxxxxxx",
"canonicalDisabled" : false
}

user.json

{
"userGuid": "xxxxxxxxxxxxxxxxxxxx",
"apiKey": "xxxxxxxxxxxxxxx"
}

Command line:

C:\Users\creatingweb03\AppData\Roaming\import.io\import.ioc.exe -crawl settings.json user.json

1 answer:

Answer 0: (score: 0)

If you set the "target" parameter in settings.json, the crawler can POST the results directly to an API endpoint.

// The url that crawled data will be HTTP POSTed to
"target" : "http://localhost:9200/index/datatype",

More information here:

import.io command-line crawler
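
On the PHP side, "target" would point at a script that reads the POSTed body. The following is only a minimal sketch: the file name receive.php, the helper decode_crawl_payload, and the log file crawl-results.log are hypothetical, and it assumes the crawler sends each result as a JSON request body. The actual payload format should be checked against the import.io documentation.

```php
<?php
// receive.php: hypothetical script that the "target" URL would point to,
// e.g. "target" : "http://localhost/drestip/admin/receive.php" (assumed path).
// Assumption: the crawler POSTs its results as a JSON body.

/**
 * Decode a raw request body into an array.
 * Returns null when the body is not a JSON object or array.
 */
function decode_crawl_payload(string $raw): ?array
{
    $data = json_decode($raw, true);
    return is_array($data) ? $data : null;
}

// Handle the request only when served over HTTP via POST.
if (PHP_SAPI !== 'cli' && ($_SERVER['REQUEST_METHOD'] ?? '') === 'POST') {
    // php://input exposes the raw request body, which $_POST does not for JSON.
    $payload = decode_crawl_payload(file_get_contents('php://input'));

    if ($payload === null) {
        http_response_code(400);
        exit('Invalid JSON');
    }

    // Placeholder processing: append one JSON line per request to a log file.
    // Replace this with a database insert or whatever the application needs.
    file_put_contents(
        __DIR__ . '/crawl-results.log',
        json_encode($payload) . PHP_EOL,
        FILE_APPEND | LOCK_EX
    );

    http_response_code(200);
    echo 'OK';
}
```

Reading php://input rather than $_POST matters here because PHP only fills $_POST for form-encoded bodies, so a JSON body would otherwise appear empty.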