Solr自动生成的id不起作用

时间:2013-07-14 14:56:18

标签: solr

我想为我的solr文档自动生成id,我完全像在Solr Cook Book中那样,但它不起作用。我得到了这个异常(在Jetty上运行默认值)。

ERROR org.apache.solr.core.CoreContainer  – Unable to create core: collection1
org.apache.solr.common.SolrException: QueryElevationComponent requires the schema to have a uniqueKeyField.
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:618)
    at org.apache.solr.core.CoreContainer.createFromLocal(CoreConta

我错过了什么吗?

我的schema.xml:

    <?xml version="1.0" encoding="UTF-8" ?>
<schema name="transcripts" version="1.5"> 

<fields>   
   <field name="id" type="uuid" indexed="true" stored="true" default="NEW" required="true"/>
   <field name="stime" type="long" indexed="true" stored="true" required="true" multiValued="false"/>
   <field name="etime" type="long" indexed="true" stored="true" required="true" multiValued="false"/>
   <field name="speakerid" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
   <field name="speakergender" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
   <field name="videoid" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
   <field name="transcriptLIUM" type="text_en_splitting" indexed="true" stored="true" multiValued="false" required="false"/>
   <field name="transcriptLIMSI" type="text_en_splitting" indexed="true" stored="true" multiValued="false" required="true"/>

  <field name="_version_" type="long" indexed="true" stored="true"/>
 </fields>

 <types>
  <fieldType name="uuid" class="solr.UUIDField" indexed="true" /> 
  <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
  <fieldType name="string" class="solr.StrField" sortMissingLast="true" /> 

    <!-- A text field with defaults appropriate for English, plus
     aggressive word-splitting and autophrase features enabled.
     This field is just like text_en, except it adds
     WordDelimiterFilter to enable splitting and matching of
     words on case-change, alpha numeric boundaries, and
     non-alphanumeric chars.  This means certain compound word
     cases will work, for example query "wi fi" will match
     document "WiFi" or "wi-fi".
        -->

<fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- TODO zde nahradi nas THD tokenizer - use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->

        <!-- Case insensitive stop word removal.
          add enablePositionIncrements=true in both the index and query
          analyzers to leave a 'gap' for more accurate phrase queries.
        -->
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="lang/stopwords_en.txt"
                enablePositionIncrements="true"
                />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="lang/stopwords_en.txt"
                enablePositionIncrements="true"
                />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
      </analyzer>
    </fieldType>

    <!-- Less flexible matching, but less false matches.  Probably not ideal for product names,
         but may be good for SKUs.  Can insert dashes in the wrong place and still match. -->
    <fieldType name="text_en_splitting_tight" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.EnglishMinimalStemFilterFactory"/>
        <!-- this filter can remove any duplicate tokens that appear at the same position - sometimes
             possible with WordDelimiterFilter in conjuncton with stemming. -->
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>  
 </types>   

</schema>

2 个答案:

答案 0 :(得分:1)

如果要保持查询提升,请阅读UniqueKey Wiki。特别是“UUID技术”部分。

答案 1 :(得分:0)

查询提升需要您在schema.xml中定义唯一的键元素。

<uniqueKey>fileid</uniqueKey>

此外,唯一键应该是唯一的,因为在您的情况下,默认值为NEW,可能不是唯一的。

另请注意

  1. Solr不需要唯一的密钥,也可以正常工作
  2. 如果您不需要查询高程组件,那么您就可以摆脱它。