Adding documents to Apache Solr via PHP cURL

Time: 2014-06-12 01:04:04

Tags: php curl solr

I don't know what I'm doing wrong. The record is not being added.

Here is my code:

$ch = curl_init("http://127.0.0.1:8983/solr/collection1/update/json?commit=true");

$data = array(
    "add" => array( "doc" => array(
        "id"   => "HW132",
        "name" => "Hello World"
    ))
);
$data_string = json_encode($data);

curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: application/json'));
curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);

$response = curl_exec($ch);

Here is the response I get from Solr:

{"responseHeader":{"status":0,"QTime":4}}

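3 answers:

Answer 0 (score: 7)

Apparently I needed to tell Apache Solr to commit the document. It does not commit documents automatically, or at least I don't know how to configure it to do so. Below is a working example, which asks Solr to commit within 1000 ms via commitWithin. I hope it helps anyone with the same problem.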

$ch = curl_init("http://127.0.0.1:8983/solr/collection1/update?wt=json");

$data = array(
    "add" => array( 
        "doc" => array(
            "id"   => "HW2212",
            "title" => "Hello World 2"
        ),
        "commitWithin" => 1000,
    ),
);
$data_string = json_encode($data);

curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: application/json'));
curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);

$response = curl_exec($ch);
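
The snippet above does not check whether the request itself succeeded. As a minimal sketch (assuming PHP 7.1+ and the cURL extension), the same request could be wrapped in small helpers that build the commitWithin payload, report transport errors, and read responseHeader.status from the reply. The function names solr_build_add_payload, solr_response_status, and solr_post_json are hypothetical and not part of any Solr client library.

```php
<?php
// Hypothetical helper names; only the cURL and JSON functions are standard PHP.

/** Build the JSON body for a single-document add that asks Solr to commit within N ms. */
function solr_build_add_payload(array $doc, int $commitWithinMs): string
{
    return json_encode(array(
        'add' => array(
            'doc'          => $doc,
            'commitWithin' => $commitWithinMs,
        ),
    ));
}

/** Return responseHeader.status from a Solr JSON response, or null if it is missing. */
function solr_response_status(string $responseJson): ?int
{
    $decoded = json_decode($responseJson, true);
    if (!is_array($decoded) || !isset($decoded['responseHeader']['status'])) {
        return null;
    }
    return (int) $decoded['responseHeader']['status'];
}

/** POST a JSON body and return the raw response; throws if the transfer itself fails. */
function solr_post_json(string $url, string $json): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
    curl_setopt($ch, CURLOPT_POSTFIELDS, $json);

    $response = curl_exec($ch);
    if ($response === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $response;
}

// Offline demonstration: this only prints the payload, no request is sent.
echo solr_build_add_payload(array('id' => 'HW2212', 'title' => 'Hello World 2'), 1000), "\n";
```

A status of 0 only means Solr accepted the update. As the question shows, the document still has to be committed before it appears in search results.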

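Answer 1 (score: 1)

It's not clear which version of Solr you are using, 3.x or 4.x (they handle commits differently, but I'll cover both). In either case, these are changes you can make in your solrconfig.xml file.

For 3.x, you can configure autocommit to trigger after a given number of documents, a given number of milliseconds, or both. Once a threshold is reached, Solr commits your changes, so you don't have to do it in your code: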

 <!-- autocommit pending docs if certain criteria are met.  Future versions may expand the available
     criteria -->
    <autoCommit>
      <maxDocs>10000</maxDocs> <!-- maximum uncommitted docs before autocommit triggered -->
      <maxTime>15000</maxTime> <!-- maximum time (in MS) after adding a doc before an autocommit is triggered -->
      <openSearcher>false</openSearcher> <!-- SOLR 4.0.  Optionally don't open a searcher on hard commit.  This is useful to minimize the size of transaction logs that keep track of uncommitted updates. -->
    </autoCommit>

For 4.x, you also have the soft commit option. It makes changes visible to search before they are synced to disk:

 <!-- SoftAutoCommit

         Perform a 'soft' commit automatically under certain conditions.
         This commit avoids ensuring that data is synced to disk.

         maxDocs - Maximum number of documents to add since the last
                   soft commit before automatically triggering a new soft commit.

         maxTime - Maximum amount of time in ms that is allowed to pass
                   since a document was added before automatically
                   triggering a new soft commit.
      -->

     <autoSoftCommit>
       <maxTime>1000</maxTime>
     </autoSoftCommit>

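I've found that thinking these settings through and implementing them in solrconfig.xml, rather than relying on commits at the application code level, gives more predictable results.

A more complete discussion of Solr commits can be found here: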

http://wiki.apache.org/solr/SolrConfigXml#Update_Handler_Section
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
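
To check whether a commit setting like the ones above has actually made a document searchable, one option is to query the core for it once the configured time has passed. The sketch below is only an illustration: it assumes the unique key field is named id, as in the question, and the helper names solr_select_url and solr_num_found are hypothetical. The sample response in the last lines is hand-written to mirror the shape of a Solr JSON select response, not captured from a real server.

```php
<?php
// Hypothetical helpers for verifying that a document is visible to search.

/** Build a select URL that looks up one document by its id (assumes the key field is "id"). */
function solr_select_url(string $coreUrl, string $id): string
{
    // Quote the id so characters such as ':' are not parsed as query syntax.
    $query = 'id:"' . addcslashes($id, '"\\') . '"';
    $params = array('q' => $query, 'wt' => 'json');
    return rtrim($coreUrl, '/') . '/select?' . http_build_query($params, '', '&');
}

/** Return response.numFound from a Solr JSON select response, or null if it is missing. */
function solr_num_found(string $responseJson): ?int
{
    $decoded = json_decode($responseJson, true);
    return isset($decoded['response']['numFound'])
        ? (int) $decoded['response']['numFound']
        : null;
}

// Offline demonstration with an illustrative, hand-written response body.
echo solr_select_url('http://127.0.0.1:8983/solr/collection1', 'HW2212'), "\n";
$sample = '{"responseHeader":{"status":0,"QTime":1},"response":{"numFound":1,"start":0,"docs":[]}}';
var_dump(solr_num_found($sample));
```

If numFound stays at 0 after the autoCommit or autoSoftCommit interval has elapsed, the document was probably not committed (or, with openSearcher set to false on a hard commit, no new searcher was opened).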

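Answer 2 (score: 1)

I was trying to post XML, and I don't know why the solution below works. The documentation says I should use '@' plus the file path to upload a file, but that didn't work. So I did this instead: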

<?php

$url = 'http://localhost:8080/solr/update/?commit=true';
$file = realpath('/home/fabio/target_file.xml');

$header = array(
    "Content-Type: text/xml",
);

$post = file_get_contents($file);

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
curl_setopt($ch, CURLOPT_VERBOSE, TRUE); 

echo curl_exec($ch);
curl_close($ch);

I got an OK (200) status:

*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> POST /solr/update/?commit=true HTTP/1.1
Host: localhost:8080
Accept: */*
Content-Type: text/xml
Content-Length: 3502
Expect: 100-continue

< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 200 OK
< Server: Apache-Coyote/1.1
< Content-Type: application/xml;charset=UTF-8
< Transfer-Encoding: chunked
< Date: Tue, 19 Jan 2016 16:52:19 GMT
< 
* Connection #0 to host localhost left intact
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">269</int></lst>
</response>
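
A likely explanation for why this works (an assumption, since the failing code is not shown): when CURLOPT_POSTFIELDS receives a string, cURL sends it as the raw request body with the Content-Type you set, so Solr sees plain XML. The '@' prefix only applies to values inside an array, which produces a multipart/form-data upload instead. That prefix was deprecated in PHP 5.5 in favor of CURLFile and removed in PHP 7.0.

If the XML does not have to come from a file, the body can also be built in code. The sketch below assumes Solr's `<add><doc><field name="...">` update format; the helper name solr_build_add_xml is hypothetical.

```php
<?php
// Hypothetical helper: build a Solr XML <add> body from a flat field => value array.

function solr_build_add_xml(array $doc): string
{
    // Escape &, <, >, and quotes so field values cannot break the XML.
    $flags = ENT_QUOTES | ENT_XML1;

    $xml = '<add><doc>';
    foreach ($doc as $name => $value) {
        $xml .= '<field name="' . htmlspecialchars((string) $name, $flags, 'UTF-8') . '">'
              . htmlspecialchars((string) $value, $flags, 'UTF-8')
              . '</field>';
    }
    return $xml . '</doc></add>';
}

// Offline demonstration: prints the body that would be passed to CURLOPT_POSTFIELDS.
echo solr_build_add_xml(array('id' => 'HW132', 'name' => 'Hello & World')), "\n";
```

The resulting string can be passed to CURLOPT_POSTFIELDS in place of the file_get_contents() result, with the same "Content-Type: text/xml" header.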