Solr查询结果与分析页面不匹配

时间:2017-05-20 06:08:05

标签: solr

我在Solr查询时遇到了一些问题。

我正在尝试搜索某个关键字(术语),它在分析页面中给出了正确的结果,但在查询中它不会返回与分析页面相同的结果。

我使用的是Solr 6.4.2

这些是我的solr数据:

{Id:1,Name:"Test k504"}
{Id:2,Name:"Test 504"}
{Id:3,Name:"Test k504-ink"}

搜索字词和结果如下。

Query   Term       current result(Id)    required result
1.      k504          1,3                   1,2,3
2.      504           1,2,3                 1,2,3(it's ok)
3.      k504-ink      3                     1,2,3

我的分析页面结果如下图所示。 enter image description here

schema.xml中

...
<fieldType name="solr_query" class="solr.TextField" 
positionIncrementGap="100">
  <analyzer type="index">        
    <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory"      
generateWordParts="1" generateNumberParts="1" catenateWords="0" 
catenateNumbers="0" catenateAll="0" preserveOriginal="1"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">        
    <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory"  
generateWordParts="1" generateNumberParts="1" catenateWords="0" 
catenateNumbers="0" catenateAll="0"
            preserveOriginal="1" />
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
 <fields>
<field name="Id" type="int" indexed="true" stored="true" required="true"  
/>
<field name="Name" type="solr_query" indexed="true" stored="true"  
 required="false" />
 </fields>
 <copyField source="Name" dest="solr_query"/>
...

更新1: Solr查询

http://localhost:8983/solr/myCore/select?fl=Name,Id&indent=on&q=Name:504&wt=json



<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="10" />

更新2:

solrconfig.xml中

<config>  
  <luceneMatchVersion>5.0.0</luceneMatchVersion>
  <lib dir="${solr.install.dir:../../../..}/contrib/extraction/lib" 
   regex=".*\.jar" />
  <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-cell-\d.*
   \.jar" />
   <lib dir="${solr.install.dir:../../../..}/contrib/clustering/lib/" 
   regex=".*\.jar" />
  <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-
   clustering-\d.*\.jar" />
  <lib dir="${solr.install.dir:../../../..}/contrib/langid/lib/"  
   regex=".*\.jar" />
  <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-langid-\d.*
   \.jar" />
  <lib dir="${solr.install.dir:../../../..}/contrib/velocity/lib" 
   regex=".*\.jar" />
  <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-velocity-\d.*
   \.jar" />
  <dataDir>${solr.data.dir:}</dataDir>  
  <directoryFactory name="DirectoryFactory"                     
   class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>  
  <codecFactory class="solr.SchemaCodecFactory"/>  
  <schemaFactory class="ClassicIndexSchemaFactory"/>
  <indexConfig>    
    <lockType>${solr.lock.type:native}</lockType>    
     <infoStream>true</infoStream>
  </indexConfig>
  <jmx />
  <updateHandler class="solr.DirectUpdateHandler2">
    <updateLog>
      <str name="dir">${solr.ulog.dir:}</str>
    </updateLog> 
     <autoCommit> 
       <maxTime>${solr.autoCommit.maxTime:15000}</maxTime> 
       <openSearcher>false</openSearcher> 
     </autoCommit>
     <autoSoftCommit> 
       <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime> 
     </autoSoftCommit>
  </updateHandler>  
  <query>
    <maxBooleanClauses>1024</maxBooleanClauses>
    <slowQueryThresholdMillis>-1</slowQueryThresholdMillis>
    <filterCache class="solr.FastLRUCache"
                 size="512"
                 initialSize="512"
                 autowarmCount="0"/>
    <queryResultCache class="solr.LRUCache"
                     size="512"
                     initialSize="512"
                     autowarmCount="0"/>  
    <documentCache class="solr.LRUCache"
                   size="512"
                   initialSize="512"
                   autowarmCount="0"/>    
    <!-- custom cache currently used by block join --> 
    <cache name="perSegFilter"
      class="solr.search.LRUCache"
      size="10"
      initialSize="0"
      autowarmCount="10"
      regenerator="solr.NoOpRegenerator" />   
    <enableLazyFieldLoading>true</enableLazyFieldLoading>
   <queryResultWindowSize>20</queryResultWindowSize>
   <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
      </arr>
    </listener>
    <listener event="firstSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">static firstSearcher warming in solrconfig.xml</str>
        </lst>
      </arr>
    </listener>
    <useColdSearcher>false</useColdSearcher>
    <maxWarmingSearchers>2</maxWarmingSearchers>
  </query>
  <requestDispatcher handleSelect="false" > 
    <requestParsers enableRemoteStreaming="true" 
                    multipartUploadLimitInKB="2048000"
                    formdataUploadLimitInKB="2048"
                    addHttpRequestToContext="false"/>
    <httpCaching never304="true" />
  </requestDispatcher>
  <requestHandler name="/select" class="solr.SearchHandler" >    
     <lst name="defaults">
      <str name="echoParams">explicit</str>
      <int name="rows">10</int>
      <str name="field">Name</str>
      <str name="spellcheck">on</str>     
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.dictionary">default</str>
      <str name="spellcheck.dictionary">wordbreak</str>
      <str name="spellcheck.maxcollations">5</str> 
      <str name="spellcheck.count">5</str>
      <str name="spellcheck.alternativetermcount">2</str>
      <str name="spellcheck.maxresultsforsuggest">5</str>
      <str name="spellcheck.maxcollationtries">5</str>
     </lst>  
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>    
 </requestHandler>
  <requestHandler name="/query" class="solr.SearchHandler">
     <lst name="defaults">
       <str name="echoParams">explicit</str>
       <str name="wt">json</str>
       <str name="indent">true</str>
       <str name="df">Name</str>
     </lst>
  </requestHandler>
   <requestHandler name="/export" class="solr.SearchHandler">
    <lst name="invariants">
      <str name="rq">{!xport}</str>
      <str name="wt">xsort</str>
      <str name="distrib">false</str>
    </lst>
    <arr name="components">
      <str>query</str>
    </arr>
  </requestHandler> 
  <initParams path="/update/**,/query,/select,/tvrh,/elevate,/spell">
    <lst name="defaults">
      <str name="df">Name</str>
    </lst>
  </initParams>
  <initParams path="/update/json/docs">
    <lst name="defaults">      
      <str name="srcField">_src_</str>
      <str name="mapUniqueKeyOnly">true</str>
    </lst>
  </initParams>
  <requestHandler name="/update/extract" 
                  startup="lazy"
                  class="solr.extraction.ExtractingRequestHandler" >
    <lst name="defaults">
      <str name="lowernames">true</str>
      <str name="uprefix">ignored_</str>
      <!-- capture link hrefs but ignore div attributes -->
      <str name="captureAttr">true</str>
      <str name="fmap.a">links</str>
      <str name="fmap.div">ignored_</str>
    </lst>
  </requestHandler>
    <requestHandler name="/analysis/field" 
                  startup="lazy"
                  class="solr.FieldAnalysisRequestHandler" />
    <requestHandler name="/analysis/document" 
                  class="solr.DocumentAnalysisRequestHandler" 
                  startup="lazy" />
    <!-- Echo the request contents back to the client -->
     <requestHandler name="/debug/dump" class="solr.DumpRequestHandler" >
    <lst name="defaults">
     <str name="echoParams">explicit</str> 
     <str name="echoHandler">true</str>
    </lst>
    </requestHandler>  
     <requestHandler name="/update" class="solr.UpdateRequestHandler">
    </requestHandler>                 
    <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
    <lst name="invariants">
      <str name="qt">/select</str>
      <str name="q">*:*</str>
    </lst>
  </requestHandler>  
    <requestHandler name="/replication" class="solr.ReplicationHandler" >  
     </requestHandler>  
  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <str name="queryAnalyzerFieldType">solr_query</str>
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">SpellContent</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <str name="distanceMeasure">internal</str>
      <float name="accuracy">0.5</float>
      <int name="maxEdits">2</int>
      <int name="minPrefix">1</int>
      <int name="maxInspections">5</int>
      <int name="minQueryLength">4</int>
      <float name="maxQueryFrequency">0.01</float>
    </lst>    
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>      
      <str name="field">SpellContent</str>
      <str name="combineWords">true</str>
      <str name="breakWords">false</str>
      <int name="maxChanges">10</int>     
    </lst>
  </searchComponent>
  <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="df">Name</str>
      <str name="spellcheck.dictionary">default</str>
      <str name="spellcheck.dictionary">wordbreak</str>
      <str name="spellcheck">on</str>
      <str name="spellcheck.extendedResults">true</str>       
      <str name="spellcheck.count">10</str>
      <str name="spellcheck.alternativeTermCount">5</str>
      <str name="spellcheck.maxResultsForSuggest">5</str>       
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.collateExtendedResults">true</str>  
      <str name="spellcheck.maxCollationTries">10</str>
      <str name="spellcheck.maxCollations">5</str>         
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>  
  <searchComponent name="suggest" class="solr.SuggestComponent" 
                   enable="${solr.suggester.enabled:false}"     >
    <lst name="suggester">
      <str name="name">mySuggester</str>
      <str name="lookupImpl">FuzzyLookupFactory</str>      
      <str name="dictionaryImpl">DocumentDictionaryFactory</str>
      <str name="field">cat</str>
      <str name="weightField">price</str>
      <str name="suggestAnalyzerFieldType">string</str>
    </lst>
  </searchComponent>
  <requestHandler name="/suggest" class="solr.SearchHandler" 
                  startup="lazy" enable="${solr.suggester.enabled:false}" >
    <lst name="defaults">
      <str name="suggest">true</str>
      <str name="suggest.count">10</str>
    </lst>
    <arr name="components">
      <str>suggest</str>
    </arr>
  </requestHandler>
  <searchComponent name="tvComponent" class="solr.TermVectorComponent"/>
  <requestHandler name="/tvrh" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <bool name="tv">true</bool>
    </lst>
    <arr name="last-components">
      <str>tvComponent</str>
    </arr>
  </requestHandler>
  <searchComponent name="clustering"
                   enable="${solr.clustering.enabled:false}"
                   class="solr.clustering.ClusteringComponent" >
    <lst name="engine">
      <str name="name">lingo</str>
      <str name="carrot.algorithm">
     org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
      <str name="carrot.resourcesDir">clustering/carrot2</str>
    </lst>
    <!-- An example definition for the STC clustering algorithm. -->
    <lst name="engine">
      <str name="name">stc</str>
      <str 
     name="carrot.algorithm">
    org.carrot2.clustering.stc.STCClusteringAlgorithm</str>
    </lst>
    <lst name="engine">
      <str name="name">kmeans</str>
      <str    name="carrot.algorithm">
    org.carrot2.clustering.kmeans.BisectingKMeansClusteri
    ngAlgorithm</str>
    </lst>
  </searchComponent>
  <requestHandler name="/clustering"
                  startup="lazy"
                  enable="${solr.clustering.enabled:false}"
                  class="solr.SearchHandler">
    <lst name="defaults">
      <bool name="clustering">true</bool>
      <bool name="clustering.results">true</bool>      
      <str name="carrot.title">name</str>     
      <str name="carrot.url">id</str>      
      <str name="carrot.snippet">features</str>
      <!-- Apply highlighter to the title/ content and use this for 
    clustering. -->
      <bool name="carrot.produceSummary">true</bool>
      <!-- the maximum number of labels per cluster -->
      <!--<int name="carrot.numDescriptions">5</int>-->
      <!-- produce sub clusters -->
      <bool name="carrot.outputSubClusters">false</bool>
      <!-- Configure the remaining request handler parameters. -->
      <str name="defType">edismax</str>
      <str name="qf">
        Name
      </str>
      <str name="q.alt">*:*</str>
      <str name="rows">10</str>
      <str name="fl">*,score</str>
    </lst>
    <arr name="last-components">
      <str>clustering</str>
    </arr>
  </requestHandler>
  <searchComponent name="terms" class="solr.TermsComponent"/>
  <!-- A request handler for demonstrating the terms component -->
  <requestHandler name="/terms" class="solr.SearchHandler" startup="lazy">
     <lst name="defaults">
      <bool name="terms">true</bool>
      <bool name="distrib">false</bool>
    </lst>     
    <arr name="components">
      <str>terms</str>
    </arr>
  </requestHandler>
  <searchComponent name="elevator" class="solr.QueryElevationComponent" >
    <!-- pick a fieldType to analyze queries -->
    <str name="queryFieldType">string</str>
    <str name="config-file">elevate.xml</str>
  </searchComponent>
  <!-- A request handler for demonstrating the elevator component -->
  <requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
    </lst>
    <arr name="last-components">
      <str>elevator</str>
    </arr>
  </requestHandler> 
    <searchComponent class="solr.HighlightComponent" name="highlight">
    <highlighting>
      <!-- Configure the standard fragmenter -->
      <!-- This could most likely be commented out in the "default" case -->
      <fragmenter name="gap" 
                  default="true"
                  class="solr.highlight.GapFragmenter">
        <lst name="defaults">
          <int name="hl.fragsize">100</int>
        </lst>
      </fragmenter>

      <!-- A regular-expression-based fragmenter 
           (for sentence extraction) 
        -->
      <fragmenter name="regex" 
                  class="solr.highlight.RegexFragmenter">
        <lst name="defaults">
          <!-- slightly smaller fragsizes work better because of slop -->
          <int name="hl.fragsize">70</int>
          <!-- allow 50% slop on fragment sizes -->
          <float name="hl.regex.slop">0.5</float>
          <!-- a basic sentence pattern -->
          <str name="hl.regex.pattern">[-\w ,/\n\&quot;&apos;]{20,200}</str>
        </lst>
      </fragmenter>
      <!-- Configure the standard formatter -->
      <formatter name="html" 
                 default="true"
                 class="solr.highlight.HtmlFormatter">
        <lst name="defaults">
          <str name="hl.simple.pre"><![CDATA[<em>]]></str>
          <str name="hl.simple.post"><![CDATA[</em>]]></str>
        </lst>
      </formatter>
      <!-- Configure the standard encoder -->
      <encoder name="html" 
               class="solr.highlight.HtmlEncoder" />
      <!-- Configure the standard fragListBuilder -->
      <fragListBuilder name="simple" 
                       class="solr.highlight.SimpleFragListBuilder"/>      
      <!-- Configure the single fragListBuilder -->
      <fragListBuilder name="single" 
                       class="solr.highlight.SingleFragListBuilder"/>      
      <!-- Configure the weighted fragListBuilder -->
      <fragListBuilder name="weighted" 
                       default="true"
                       class="solr.highlight.WeightedFragListBuilder"/>      
      <!-- default tag FragmentsBuilder -->
      <fragmentsBuilder name="default" 
                        default="true"
                        class="solr.highlight.ScoreOrderFragmentsBuilder">            
        <!-- 
        <lst name="defaults">
          <str name="hl.multiValuedSeparatorChar">/</str>
        </lst>
        -->
      </fragmentsBuilder>
      <!-- multi-colored tag FragmentsBuilder -->
      <fragmentsBuilder name="colored" 
                        class="solr.highlight.ScoreOrderFragmentsBuilder">
        <lst name="defaults">
          <str name="hl.tag.pre"><![CDATA[
               <b style="background:yellow">,<b style="background:lawgreen">,
               <b style="background:aquamarine">,<b 
   style="background:magenta">,
               <b style="background:palegreen">,<b style="background:coral">,
               <b style="background:wheat">,<b style="background:khaki">,
               <b style="background:lime">,<b 
   style="background:deepskyblue">]]></str>
          <str name="hl.tag.post"><![CDATA[</b>]]></str>
        </lst>
      </fragmentsBuilder>      
      <boundaryScanner name="default" 
                       default="true"
                       class="solr.highlight.SimpleBoundaryScanner">
        <lst name="defaults">
          <str name="hl.bs.maxScan">10</str>
          <str name="hl.bs.chars">.,!? &#9;&#10;&#13;</str>
        </lst>
      </boundaryScanner>      
      <boundaryScanner name="breakIterator" 
                       class="solr.highlight.BreakIteratorBoundaryScanner">
        <lst name="defaults">
          <!-- type should be one of CHARACTER, WORD(default), LINE and  
   SENTENCE -->
          <str name="hl.bs.type">WORD</str>
          <!-- language and country are used when constructing Locale object.    
    -->
          <!-- And the Locale object will be used when getting instance of 
   BreakIterator -->
          <str name="hl.bs.language">en</str>
          <str name="hl.bs.country">US</str>
        </lst>
      </boundaryScanner>
    </highlighting>
  </searchComponent>
  <queryResponseWriter name="json" class="solr.JSONResponseWriter">
    <str name="content-type">text/plain; charset=UTF-8</str>
  </queryResponseWriter>  
    <queryResponseWriter name="velocity" class="solr.VelocityResponseWriter" 
   startup="lazy">
      <str name="template.base.dir">${velocity.template.base.dir:}</str>
    </queryResponseWriter>  
  <queryResponseWriter name="xslt" class="solr.XSLTResponseWriter">
    <int name="xsltCacheLifetimeSeconds">5</int>
  </queryResponseWriter>
  <admin>
    <defaultQuery>*:*</defaultQuery>
  </admin>
   </config>

我尝试了简单的Solr查询但是没有得到任何正确的结果以及下面的过滤器

NGramFilterFactory

WordDelimiterFilterFactory

所以如果我错过了什么,请告诉我,

谢谢,

1 个答案:

答案 0 :(得分:0)

您遇到的问题是架构中的“名称”字段的类型为“text”。 您应该将类​​型更正为“solr_query”。

<field name="Name" type="solr_query" indexed="true" stored="true"  
 required="false" />

请记住,在架构更改后,您必须重新加载核心或重新启动Solr,然后重新加载整个集合数据。

再次,

<copyField source="Name" dest="solr_query"/>

此配置行似乎没用,因为它将字段名称的内容复制到字段solr_query中。但据我所知,你没有这样的字段solr_query。

您只声明了一个名为solr_query的类型。