Solr - unexpected sort order and sortMissingLast

Time: 2016-06-16 11:26:04

Tags: sorting solr

My text type is defined as follows:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory" />
    </analyzer>
    ...

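Several fields use this type. One of them is the title field, which is always defined: it is never missing and never empty for any document. When I sort by this field, asc or desc, Solr returns the documents in a seemingly random order rather than the requested one. The sort order is only correct after I add sortMissingLast="true" to the type declaration.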

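Can someone explain why this happens? As I understand it, sortMissingLast should have no effect on this sort, because a) it only concerns where documents without the field are placed, and b) every document in my collection has this field defined.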

Further reading:

If sortMissingLast="true", then a sort on this field will cause documents without the field to come after documents with the field, regardless of the requested sort order (asc or desc)
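
To make the quoted rule concrete, here is a small Python model of it. This is only an illustrative sketch, not Solr's implementation: the function name `sort_missing_last` and the sample documents are made up for the example. It shows that documents lacking the field end up last for both asc and desc, and that the flag changes nothing when every document has the field.

```python
def sort_missing_last(docs, field, descending=False):
    """Toy model of sortMissingLast="true" (hypothetical helper, not Solr code)."""
    # Split documents into those that have the field and those that do not.
    present = [d for d in docs if field in d]
    missing = [d for d in docs if field not in d]
    # Only the documents that have the field take part in the requested ordering.
    present.sort(key=lambda d: d[field], reverse=descending)
    # Documents without the field are appended last, whatever the direction.
    return present + missing


docs = [
    {"id": "1", "title": "b"},
    {"id": "2"},  # no title field at all
    {"id": "3", "title": "a"},
]

asc_ids = [d["id"] for d in sort_missing_last(docs, "title")]
desc_ids = [d["id"] for d in sort_missing_last(docs, "title", descending=True)]
print(asc_ids)   # ['3', '1', '2']
print(desc_ids)  # ['1', '3', '2']
```

Document 2 stays at the end in both cases. If it had a title, the `missing` list would be empty and the result would be a plain sort.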

I do have other fields that use the same text type, but all of them are present. They may be empty, but they exist in the documents.

1 Answer:

Answer 0: (score: 0)

I tested a sample index using your fieldType, i.e. text. When I sorted by title asc, I got the following response:

    {
    "responseHeader":{
        "status":0,
        "QTime":1,
        "params":{
          "sort":"title asc",
          "indent":"true",
          "q":"*:*",
          "wt":"json"}},
      "response":{"numFound":10,"start":0,"docs":[
          {
            "id":["123"],
            "title":"awesome designs_takeaway",
            "lastmodified":"f75a2e26-cb41-4028-abb2-bcd7f61e4f9e",
            "_version_":1538551521252212736},
          {
            "id":["124"],
            "title":"breathtaking_designs takeaway",
            "lastmodified":"170b3857-d906-44df-950c-547c25b4e594",
            "_version_":1538551543494606848},
          {
            "id":["125"],
            "title":"curtain raiser",
            "lastmodified":"ea7149d5-449f-4d69-919b-617b90420381",
            "_version_":1538573292313509888},
          {
            "id":["126"],
            "title":"defying gravity_008",
            "lastmodified":"82844b75-24ba-4b2f-be20-9bb3fe83e6b1",
            "_version_":1538551590630195200},
          {
            "id":["127"],
            "title":"emancipation_of the poor",
            "lastmodified":"d19482a5-1666-4d4e-a40e-eb93c00eca7e",
            "_version_":1538551627310432256},
          {
            "id":["128"],
            "title":"functioning of the-metadata",
            "lastmodified":"7b07f281-1268-48cc-aee6-7a6636702ba5",
            "_version_":1538551653171462144},
          {
            "id":["130"],
            "title":"graphics enhancer 101",
            "lastmodified":"67fd79d6-2ae5-4597-b2e1-128bfd815b67",
            "_version_":1538551680471138304},
          {
            "id":["131"],
            "title":"half-hearted attempt",
            "lastmodified":"abb4707c-8392-4595-aaeb-fbf6d4f098b1",
            "_version_":1538551699761790976},
          {
            "id":["132"],
            "title":"INK jet corporation",
            "lastmodified":"b29ba3af-f3da-49d1-bd45-f7d277c53cff",
            "_version_":1538551727666495488},
          {
            "id":["136"],
            "title":"xamarin",
            "filecontent":"bolshevik",
            "lastmodified":"af8a1445-e693-4bac-9ac8-84fa2c9b838d",
            "_version_":1538571040880328704}]
      }}

When I sorted by title desc, I got the following response:

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "sort":"title desc",
      "indent":"true",
      "q":"*:*",
      "wt":"json"}},
  "response":{"numFound":10,"start":0,"docs":[
      {
        "id":["136"],
        "title":"xamarin",
        "filecontent":"bolshevik",
        "lastmodified":"af8a1445-e693-4bac-9ac8-84fa2c9b838d",
        "_version_":1538571040880328704},
      {
        "id":["132"],
        "title":"INK jet corporation",
        "lastmodified":"b29ba3af-f3da-49d1-bd45-f7d277c53cff",
        "_version_":1538551727666495488},
      {
        "id":["131"],
        "title":"half-hearted attempt",
        "lastmodified":"abb4707c-8392-4595-aaeb-fbf6d4f098b1",
        "_version_":1538551699761790976},
      {
        "id":["130"],
        "title":"graphics enhancer 101",
        "lastmodified":"67fd79d6-2ae5-4597-b2e1-128bfd815b67",
        "_version_":1538551680471138304},
      {
        "id":["128"],
        "title":"functioning of the-metadata",
        "lastmodified":"7b07f281-1268-48cc-aee6-7a6636702ba5",
        "_version_":1538551653171462144},
      {
        "id":["127"],
        "title":"emancipation_of the poor",
        "lastmodified":"d19482a5-1666-4d4e-a40e-eb93c00eca7e",
        "_version_":1538551627310432256},
      {
        "id":["126"],
        "title":"defying gravity_008",
        "lastmodified":"82844b75-24ba-4b2f-be20-9bb3fe83e6b1",
        "_version_":1538551590630195200},
      {
        "id":["125"],
        "title":"curtain raiser",
        "lastmodified":"ea7149d5-449f-4d69-919b-617b90420381",
        "_version_":1538573292313509888},
      {
        "id":["124"],
        "title":"breathtaking_designs takeaway",
        "lastmodified":"170b3857-d906-44df-950c-547c25b4e594",
        "_version_":1538551543494606848},
      {
        "id":["123"],
        "title":"awesome designs_takeaway",
        "lastmodified":"f75a2e26-cb41-4028-abb2-bcd7f61e4f9e",
        "_version_":1538551521252212736}]
  }}

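As you can see, I got the expected results. I used Solr v 5.3.2. Your text type does not split the text into separate tokens, so it is a suitable candidate for sorting, and nothing in the type definition explains the problem.

The sortMissingLast and sortMissingFirst parameters serve a completely different purpose. I did try them while attempting to reproduce your observation, but I only ever saw the expected results. Since you said every one of your documents has a title field, I kept a title field in all of my documents too. In that case sortMissingLast and sortMissingFirst have nothing to act on, because they only affect result sets that contain documents without a title field, so your results should not differ from mine.

This is only speculation, but your Solr may have a bug. If you are not on the same version as me, try your documents once on 5.3.2, or on some other version than the one you have. Alternatively, if you don't suspect a bug in Solr, could you post a subset of the titles that are sorted incorrectly on your side, so we can see how they are being analyzed? Let me know if this helps :).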
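
As a rough cross-check of the order above, here is a Python approximation of what the analyzer chain does to each title. It assumes that KeywordTokenizerFactory keeps the whole value as a single token and LowerCaseFilterFactory lowercases it, and that the sort compares those tokens. The helper name `sort_key` is made up, and Python compares by code point, which only approximates Solr's term ordering (it matches for these ASCII titles).

```python
# Titles from the sample index above, deliberately listed out of order.
titles = [
    "xamarin",
    "INK jet corporation",
    "curtain raiser",
    "awesome designs_takeaway",
    "half-hearted attempt",
    "defying gravity_008",
    "graphics enhancer 101",
    "breathtaking_designs takeaway",
    "functioning of the-metadata",
    "emancipation_of the poor",
]


def sort_key(value):
    # Whole value as one token (keyword tokenizer), then lowercased.
    return value.lower()


ascending = sorted(titles, key=sort_key)
descending = sorted(titles, key=sort_key, reverse=True)

# Without lowercasing, the uppercase title would jump to the front.
case_sensitive = sorted(titles)

print(ascending[0], "|", ascending[-2], "|", ascending[-1])
print(case_sensitive[0])
```

With the lowercased key, "INK jet corporation" lands between "half-hearted attempt" and "xamarin", as in both responses. A case-sensitive comparison would put it first instead, so a misplaced mixed-case title can be a hint that the field is not being analyzed the way the type definition suggests.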