How are Solr fields mapped in Spring Data Solr?

Time: 2016-11-26 14:14:09

Tags: solr spring-boot spring-data-jpa nutch

I am trying to use Spring Data Solr to query content from a backend Solr server that has the following schema.xml (only the fields are shown, for simplicity), which was copied from Nutch's schema.xml. In other words, I crawl with Nutch and then pass the segments to Solr:

...

<fields>

    <!--APPARENTLY THE ONLY FIELD WHICH IS REQUIRED!!! -->
    <field name="id" type="string" stored="true" indexed="true" required="true"/>

    <field name="_version_" type="long" indexed="true" stored="true"/>

    <!-- core fields -->
    <field name="segment" type="string" stored="true" indexed="false"/>
    <field name="digest" type="string" stored="true" indexed="false"/>
    <field name="boost" type="float" stored="true" indexed="false"/>

    <!-- fields for index-basic plugin -->
    <field name="host" type="url" stored="false" indexed="true"/>
    <field name="url" type="url" stored="true" indexed="true"/>
    <!-- stored=true for highlighting, use term vectors  and positions for fast highlighting -->
    <field name="content" type="text_general" stored="true" indexed="true"/>
    <field name="title" type="text_general" stored="true" indexed="true"/>
    <field name="cache" type="string" stored="true" indexed="false"/>
    <field name="tstamp" type="date" stored="true" indexed="false"/>

    <!-- fields for index-geoip plugin -->
    <field name="ip" type="string" stored="true" indexed="true"/>
    <field name="cityName" type="string" stored="true" indexed="true"/>
    <field name="cityConfidence" type="int" stored="true" indexed="true"/>
    <field name="cityGeoNameId" type="int" stored="true" indexed="true"/>
    <field name="continentCode" type="string" stored="true" indexed="true"/>
    <field name="continentGeoNameId" type="int" stored="true" indexed="true"/>
    <field name="contentName" type="string" stored="true" indexed="true"/>
    <field name="countryIsoCode" type="string" stored="true" indexed="true"/>
    <field name="countryName" type="string" stored="true" indexed="true"/>
    <field name="countryConfidence" type="int" stored="true" indexed="true"/>
    <field name="countryGeoNameId" type="int" stored="true" indexed="true"/>
    <field name="latLon" type="string" stored="true" indexed="true"/>
    <field name="accRadius" type="int" stored="true" indexed="true"/>
    <field name="timeZone" type="string" stored="true" indexed="true"/>
    <field name="metroCode" type="int" stored="true" indexed="true"/>
    <field name="postalCode" type="string" stored="true" indexed="true"/>
    <field name="postalConfidence" type="int" stored="true" indexed="true"/>
    <field name="countryType" type="string" stored="true" indexed="true"/>
    <field name="subDivName" type="string" stored="true" indexed="true"/>
    <field name="subDivIsoCode" type="string" stored="true" indexed="true"/>
    <field name="subDivConfidence" type="int" stored="true" indexed="true"/>
    <field name="subDivGeoNameId" type="int" stored="true" indexed="true"/>
    <field name="autonSystemNum" type="int" stored="true" indexed="true"/>
    <field name="autonSystemOrg" type="string" stored="true" indexed="true"/>
    <field name="domain" type="string" stored="true" indexed="true"/>
    <field name="isp" type="string" stored="true" indexed="true"/>
    <field name="org" type="string" stored="true" indexed="true"/>
    <field name="userType" type="string" stored="true" indexed="true"/>
    <field name="isAnonProxy" type="boolean" stored="true" indexed="true"/>
    <field name="isSatelitteProv" type="boolean" stored="true" indexed="true"/>
    <field name="connType" type="string" stored="true" indexed="true"/>
    <field name="location" type="location" stored="true" indexed="true"/>

    <dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>

    <!-- catch-all field -->
    <field name="text" type="text_general" stored="false" indexed="true" multiValued="true"/>

    <!-- fields for index-anchor plugin -->
    <field name="anchor" type="text_general" stored="true" indexed="true" multiValued="true"/>

    <!-- fields for index-more plugin -->
    <field name="type" type="string" stored="true" indexed="true" multiValued="true"/>
    <field name="contentLength" type="string" stored="true" indexed="false"/>
    <field name="lastModified" type="date" stored="true" indexed="false"/>
    <field name="date" type="tdate" stored="true" indexed="true"/>

    <!-- fields for languageidentifier plugin -->
    <field name="lang" type="string" stored="true" indexed="true"/>

    <!-- fields for subcollection plugin -->
    <field name="subcollection" type="string" stored="true" indexed="true" multiValued="true"/>

    <!-- fields for feed plugin (tag is also used by microformats-reltag)-->
    <field name="author" type="string" stored="true" indexed="true"/>
    <field name="tag" type="string" stored="true" indexed="true" multiValued="true"/>
    <field name="feed" type="string" stored="true" indexed="true"/>
    <field name="publishedDate" type="date" stored="true" indexed="true"/>
    <field name="updatedDate" type="date" stored="true" indexed="true"/>

    <!-- fields for creativecommons plugin -->
    <field name="cc" type="string" stored="true" indexed="true" multiValued="true"/>

    <!-- fields for tld plugin -->
    <field name="tld" type="string" stored="false" indexed="false"/>

    <!-- field containing segment's raw binary content if indexed with -addBinaryContent -->
    <field name="binaryContent" type="binary" stored="true" indexed="false"/>

</fields>

...

Now, looking at the Spring Data Solr documentation, for example:

http://docs.spring.io/spring-data/solr/docs/1.4.x/reference/html/#reference

the methods and fields they use do not seem to match the fields in my schema. For example, their documentation has:

public interface ProductRepository extends Repository<Product, String> {
  List<Product> findByNameAndPopularity(String name, Integer popularity);
}

public interface ProductRepository extends SolrRepository<Product, String> {
  @Query("inStock:?0")
  List<Product> findByAvailable(Boolean available);
}

Looking at my fields, I have none called "name", "popularity" or "available". What am I missing? Should I change my schema? Should I change the repository from the documentation?

That last question may seem silly, but the examples I have seen that use Spring Data Solr only create a Product model (I know it is an example, but examples usually reflect the default case!) and its corresponding Solr repository, and they usually contain fields such as "name", "popularity" and "author". I am not sure what these fields actually represent or which Solr fields they map to.

1 answer:

Answer 0: (score: 0)

You need to define a Java class that mirrors your Solr document, using the @Field annotation to tie each property to a field in your schema.

Have a look at tutorials such as https://www.petrikainulainen.net/programming/solr/spring-data-solr-tutorial-crud-almost/
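
To make the idea concrete, here is a minimal, dependency-free sketch of what "mirroring the Solr document" means for the Nutch schema in the question. Everything in it is an assumption for illustration: the `WebPage` class, the `pageTitle` property name and the `bind` helper are hypothetical, and the nested `Field` annotation is only a local stand-in for SolrJ's `org.apache.solr.client.solrj.beans.Field` so that the example compiles on its own. In a real project you would import the SolrJ annotation and let Spring Data Solr do the binding.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.LinkedHashMap;
import java.util.Map;

class NutchFieldMappingSketch {

    // Local stand-in (assumption) for SolrJ's @Field annotation,
    // declared here only so the sketch runs without any dependency.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Field {
        String value(); // name of the field in schema.xml
    }

    // Hypothetical model class mirroring a few fields of the Nutch schema.
    static class WebPage {
        @Field("id") String id;
        @Field("url") String url;
        // The Java property name may differ from the Solr field name.
        @Field("title") String pageTitle;
        @Field("content") String content;
    }

    // Hypothetical helper: copies values from a Solr-like document
    // (field name -> value) into the annotated properties of a WebPage.
    static WebPage bind(Map<String, Object> solrDoc) {
        WebPage page = new WebPage();
        for (java.lang.reflect.Field property : WebPage.class.getDeclaredFields()) {
            Field mapping = property.getAnnotation(Field.class);
            if (mapping == null || !solrDoc.containsKey(mapping.value())) {
                continue; // unmapped property or field absent from the document
            }
            try {
                property.setAccessible(true);
                property.set(page, solrDoc.get(mapping.value()));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot bind " + mapping.value(), e);
            }
        }
        return page;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "http://example.com/");
        doc.put("url", "http://example.com/");
        doc.put("title", "Example Domain");
        WebPage page = bind(doc);
        System.out.println(page.pageTitle + " <- Solr field 'title'");
    }
}
```

So "name", "popularity" and "available" in the documentation are just properties of the documentation's own example model. With a class shaped like the one above, the repository would be declared over your own type and its query methods would refer to your own properties rather than to the ones in the documentation.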