Solr 7.1.0: spellcheck every word?

Time: 2017-11-22 14:23:51

Tags: solr spell-checking

/select?q=Spannish

returns spellcheck suggestions as expected.

{
  "responseHeader":{
    "status":0,
    "QTime":5,
    "params":{
      "q":"Spannish",
      "fq":"auctionId:71",
      "_":"1511301093854"}},
  "response":{"numFound":0,"start":0,"docs":[]
  },
  "spellcheck":{
    "suggestions":[
      "spannish",{
        "numFound":5,
        "startOffset":0,
        "endOffset":8,
        "origFreq":0,
        "suggestion":[{
            "word":"spanish",
            "freq":343},
          {
[...]
    "correctlySpelled":false,
    "collations":[
      "collation",{
        "collationQuery":"spanish",
        "hits":5,
        "misspellingsAndCorrections":[
          "spannish","spanish"]},
[...]

However,

/select?q=Spannish Rifle

returns results for "Rifle" and treats the query as correctly spelled.

{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "q":"Spannish Rifle",
      "fq":"auctionId:71",
      "_":"1511301093854"}},
  "response":{"numFound":839,"start":0,"docs":[
      {
[...]
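
To compare the two responses side by side, it can help to reduce each one to its hit count, its `correctlySpelled` flag, and the suggested words. The following is a minimal sketch, assuming the JSON layout shown above, where `suggestions` is a flat list alternating term and details. The helper name `summarize_spellcheck` is made up for illustration, and the sample is the first response trimmed to the one suggestion visible above.

```python
import json


def summarize_spellcheck(response: dict) -> dict:
    """Reduce a Solr JSON response to hit count, spelling verdict, and suggestions."""
    spellcheck = response.get("spellcheck", {})
    flat = spellcheck.get("suggestions", [])
    suggestions = {}
    # The list alternates: term, details, term, details, ...
    for term, info in zip(flat[0::2], flat[1::2]):
        words = []
        for entry in info.get("suggestion", []):
            # With extendedResults each entry is {"word": ..., "freq": ...};
            # otherwise it is a plain string.
            words.append(entry["word"] if isinstance(entry, dict) else entry)
        suggestions[term] = words
    return {
        "numFound": response.get("response", {}).get("numFound"),
        "correctlySpelled": spellcheck.get("correctlySpelled"),
        "suggestions": suggestions,
    }


# First response from above, trimmed to the fields and the one suggestion shown.
sample = json.loads("""
{
  "response": {"numFound": 0, "start": 0, "docs": []},
  "spellcheck": {
    "suggestions": [
      "spannish", {
        "numFound": 5, "startOffset": 0, "endOffset": 8, "origFreq": 0,
        "suggestion": [{"word": "spanish", "freq": 343}]}],
    "correctlySpelled": false
  }
}
""")

print(summarize_spellcheck(sample))
```

Running the same helper on the "Spannish Rifle" response would show whether a `spellcheck` section is present at all and what `correctlySpelled` is set to.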

My request handler has the following spellcheck defaults:

  <str name="spellcheck.onlyMorePopular">true</str>
  <str name="spellcheck.dictionary">default</str>
  <str name="spellcheck">on</str>
  <str name="spellcheck.extendedResults">true</str>
  <str name="spellcheck.count">10</str>
  <str name="spellcheck.alternativeTermCount">5</str>
  <str name="spellcheck.maxResultsForSuggest">5</str>
  <str name="spellcheck.collate">true</str>
  <str name="spellcheck.collateExtendedResults">true</str>
  <str name="spellcheck.maxCollationTries">10</str>
  <str name="spellcheck.maxCollations">5</str>
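
Since these are handler defaults, any of them can be overridden per request, which gives a quick way to test which setting affects the two-word query. One candidate worth trying is `spellcheck.maxResultsForSuggest`: the Solr documentation describes it as the maximum number of hits a query may return while still producing suggestions and `correctlySpelled=false`, and the second query returned 839 hits against a default of 5. The sketch below only builds the request URLs; the host, port, and core name in `SOLR_SELECT` are placeholders, and `build_select_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode

# Placeholder endpoint: host, port, and core name are assumptions.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"


def build_select_url(query: str, overrides: Optional[dict] = None) -> str:
    """Build a /select URL, optionally overriding handler defaults per request."""
    params = {"q": query, "fq": "auctionId:71", "wt": "json"}
    params.update(overrides or {})
    return SOLR_SELECT + "?" + urlencode(params)


# Same query, once with the handler defaults and once with a raised threshold.
baseline = build_select_url("Spannish Rifle")
raised = build_select_url(
    "Spannish Rifle", {"spellcheck.maxResultsForSuggest": 1000}
)
print(baseline)
print(raised)
```

If the second URL returns suggestions for "spannish" and the first does not, the threshold is what suppresses them; if both behave the same, the cause lies elsewhere.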

This is the search component:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text_ws</str>
  <lst name="spellchecker">
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="field">_spelling_</str>
    <str name="buildOnCommit">true</str>
    <str name="distanceMeasure">org.apache.lucene.search.spell.LevensteinDistance</str>
    <str name="accuracy">0.65</str>
  </lst>
</searchComponent>

The _spelling_ field uses the field type text_ws:

<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
<field name="_spelling_" type="text_ws" multiValued="true" indexed="true" stored="false"/>

How can I get Solr to tell me that "Spannish" is misspelled in the second query?

Thanks.

0 answers:

No answers yet.