Ideal way to handle Solr results in PHP?

Time: 2010-06-04 11:49:00

Tags: php solr simplexml

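Firstly, I know there are a few questions similar to this one, but I think this case is different enough to warrant its own question.

I am running a Solr index through a Jetty install on a LAMP server. I currently pull in the search results with the simplexml_load_file function and then parse them through a few functions. I was happy with this process until I ran into a fundamental problem.

The field names are not passed through by the simplexml function. For example, this result: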

<doc>
  <float name="score">0.73325396</float>
  <str name="add1">Ravensbridge Drive</str>
  <str name="comments">0</str>
  <str name="company">Stratstone Lotus Leicester</str>
  <str name="feed_id"/>
  <str name="id">1711765</str>
  <str name="pcode">LE4 0BX</str>
  <str name="psearch">LE4</str>
  <str name="rating">0</str>
</doc>

looks like this in the simplexml object:

 [doc] => Array
 (
   [0] => SimpleXMLElement Object
   (
     [float] => 0.73325396
     [str] => Array
     (
       [0] => Ravensbridge Drive
       [1] => 0
       [2] => Stratstone Lotus Leicester
       [3] => SimpleXMLElement Object
       (
         [@attributes] => Array
         (
           [name] => feed_id
         )
       )
       [4] => 1711765
       [5] => LE4 0BX
       [6] => LE4
       [7] => 0
     )
   )

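When a full data set is found, there are 11 pieces of data stored in the array, but when some data is missing the values shift position and my parser is thrown off.

So I have looked at libraries/classes that do this properly, namely the two main ones: the Apache Solr PHP extension and solr-php-client. Both look overly complicated, though, with few practical examples, and neither appears to support different Solr cores, of which I use several.

What is the best thing to use? I am pretty stuck right now, and any help would be much appreciated.

Thanks!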

1 answer:

Answer 0 (score: 8)

Definitely use one of the existing clients. As for multicore support, it is as simple as creating a client instance for each Solr instance.
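As a minimal sketch of the one-client-per-core idea, the snippet below builds a separate options array for each core and, if the PHP Solr extension is loaded, creates one SolrClient per core. The helper name solrCoreOptions, the core names core0 and core1, and the assumption that each core is served under /solr/<core name> are illustrative and not taken from the question.

```php
<?php
/**
 * Hypothetical helper: build SolrClient options for one named core.
 * Assumes each core is reachable under /solr/<core name> on the same host.
 */
function solrCoreOptions(string $core): array
{
    return array(
        'hostname' => 'localhost',
        'port'     => '8080',
        'path'     => '/solr/' . $core,
    );
}

// One client per core, keyed by core name (the core names are assumptions).
$clients = array();
foreach (array('core0', 'core1') as $core) {
    $options = solrCoreOptions($core);
    // Only instantiate when the Solr extension is actually installed.
    if (class_exists('SolrClient')) {
        $clients[$core] = new SolrClient($options);
    }
}
```

The same approach should carry over to solr-php-client by passing a core-specific path as the third constructor argument, assuming the cores are exposed the same way.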

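The Solr extension is the more powerful of the two, while still being quite intuitive to use. Here are a couple of sample code snippets that run a basic query and fetch the results with each library: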

PHP Solr extension

<?php
$options = array
(
    'hostname' => 'localhost',
    'port'     => '8080',
    'path'     => '/solr'
);

$client = new SolrClient($options);

$query = new SolrQuery();
$query->setQuery('fox');
$query->setStart(0);
$query->setRows(50);
// specify which fields we want to retrieve
$query->addField('id')->addField('title_t')->addField('source_t');

$res = $client->query($query)->getResponse();

// what does the response look like?
var_dump($res);
/*
object(SolrObject)[4]
  public 'responseHeader' => 
    object(SolrObject)[5]
      public 'status' => int 0
      public 'QTime' => int 0
      public 'params' => 
        object(SolrObject)[6]
          public 'fl' => string 'id,title_t,source_t' (length=19)
          public 'indent' => string 'on' (length=2)
          public 'start' => string '0' (length=1)
          public 'q' => string 'fox' (length=3)
          public 'wt' => string 'xml' (length=3)
          public 'rows' => string '50' (length=2)
          public 'version' => string '2.2' (length=3)
  public 'response' => 
    object(SolrObject)[7]
      public 'numFound' => int 39
      public 'start' => int 0
      public 'docs' => 
        array
          0 => 
            object(SolrObject)[8]
              ...
          1 => 
            object(SolrObject)[9]
              ...
          2 => 
            object(SolrObject)[10]
              ...
          (...)
*/
// what does a document look like?
var_dump($res->response->docs[0]);
/*
object(SolrObject)[8]
  public 'id' => int 11408
  public 'source_t' => string 'CBD News Headlines' (length=18)
  public 'title_t' => string 'Hunting across Southeast Asia weakens forests' survival' (length=55)
*/

solr-php-client (official example of use)

<?php
require_once 'library/SolrPhpClient/Apache/Solr/Service.php';

$solr = new Apache_Solr_Service('localhost', '8080', '/solr');

if (!$solr->ping()) {
    exit('Solr service not responding.');
}

$offset = 0;
$limit = 50;

$query = 'fox';
$res = $solr->search($query, $offset, $limit);

// what does the response look like?
var_dump($res->response);

/*
object(stdClass)[6]
  public 'numFound' => int 39
  public 'start' => int 0
  public 'docs' => 
    array
      0 => 
        object(Apache_Solr_Document)[46]
          protected '_documentBoost' => boolean false
          protected '_fields' => 
            array
              ...
          protected '_fieldBoosts' => 
            array
              ...
      1 => 
        object(Apache_Solr_Document)[47]
          protected '_documentBoost' => boolean false
          protected '_fields' => 
            array
              ...
          protected '_fieldBoosts' => 
            array
              ...
     (...)
*/

// what does a document look like?
var_dump($res->response->docs[0]);

/*
object(Apache_Solr_Document)[46]
  protected '_documentBoost' => boolean false
  protected '_fields' => 
    array
      'publicationTime_i' => int 1257724800
      'publicationDate_t' => string 'Mon, 9 Nov 2009' (length=15)
      'url_s' => string 'http://news.mongabay.com/2009/1108-hance_corlett.html' (length=53)
      'language_s' => string 'EN' (length=2)
      'title_t' => string 'Hunting across Southeast Asia weakens forests' survival' (length=55)
      'text' => string 'A large flying fox eats a fruit ingesting its seeds.' (length=52)
      'id' => int 11408
      'relevance_i' => int 27
      'source_t' => string 'CBD News Headlines' (length=18)
  protected '_fieldBoosts' => 
    array
      'publicationTime_i' => boolean false
      'publicationDate_t' => boolean false
      'url_s' => boolean false
      'language_s' => boolean false
      'title_t' => boolean false
      'text' => boolean false
      'id' => boolean false
      'relevance_i' => boolean false
      'source_t' => boolean false
*/
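If switching to a client library is not an option, the field names from the question's XML are not actually lost: SimpleXML exposes each element's name attribute, so a document can be keyed by field name instead of by position. The following is a rough sketch under that assumption; the helper name solrDocToArray is hypothetical, and it only handles flat scalar fields like the ones in the question, not multi-valued arr elements.

```php
<?php
// Sample <doc> element copied from the Solr response in the question.
$xml = <<<XML
<doc>
  <float name="score">0.73325396</float>
  <str name="add1">Ravensbridge Drive</str>
  <str name="comments">0</str>
  <str name="company">Stratstone Lotus Leicester</str>
  <str name="feed_id"/>
  <str name="id">1711765</str>
  <str name="pcode">LE4 0BX</str>
  <str name="psearch">LE4</str>
  <str name="rating">0</str>
</doc>
XML;

/**
 * Hypothetical helper: convert one Solr <doc> element into an
 * associative array keyed by each child's "name" attribute.
 * All values are returned as strings; empty elements become ''.
 */
function solrDocToArray(SimpleXMLElement $doc): array
{
    $fields = array();
    foreach ($doc->children() as $field) {
        $name = (string) $field['name'];
        $fields[$name] = (string) $field;
    }
    return $fields;
}

$fields = solrDocToArray(simplexml_load_string($xml));
echo $fields['company'], "\n";
```

Because values are looked up by name, an empty or missing field such as feed_id no longer shifts the positions of the others.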