Searching the "content" field in Solr does not work

Time: 2016-05-01 08:21:20

Tags: pdf select search solr field

I uploaded and extracted a document in Solr 6.0.0, and the following query shows that it has been indexed:

http://localhost:8983/solr/techproducts/select?indent=on&q=id:doc1&wt=json

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "q":"id:doc1",
      "indent":"on",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "links":["http://www.education.gov.yk.ca/"],
        "id":"doc1",
        "last_modified":"2008-06-04T22:47:36Z",
        "title":[" PDF Test Page"],
        "content_type":["application/pdf"],
        "author":"Yukon Canada Yukon Department of Education",
        "author_s":"Yukon Canada Yukon Department of Education",
        "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  PDF Test Page \n \n    \n  \n \nPDF Test File \n \nCongratulations, your computer is equipped with a PDF (Portable Document Format) \nreader!  You should be able to view any of the PDF documents and forms available on \nour site.  PDF forms are indicated by these icons:   or  .   \n \nYukon Department of Education \nBox 2703 \nWhitehorse,Yukon \nCanada \nY1A 2C6 \n \nPlease visit our website at:  http://www.education.gov.yk.ca/\n    \n  \n    \n \n  "],
        "_version_":1533049305513852928}]
  }}

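I can see that the word PDF appears several times in the content field.

Given that the field is named content and contains PDF, why do I get no results for the following query?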

select?q=*:*&fq=content:PDF

{
  "responseHeader":{
    "status":0,
    "QTime":4,
    "params":{
      "q":"*:*",
      "indent":"on",
      "fq":"content:PDF",
      "rows":"50",
      "wt":"json"}},
  "response":{"numFound":0,"start":0,"docs":[]
  }}

When I query a different field, for example title, I get the correct result:

select?q=*:*&fq=title:PDF

{
  "responseHeader":{
    "status":0,
    "QTime":3,
    "params":{
      "q":"*:*",
      "indent":"on",
      "fq":"title:PDF",
      "rows":"50",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "links":["http://www.education.gov.yk.ca/"],
        "id":"doc1",
        "last_modified":"2008-06-04T22:47:36Z",
        "title":[" PDF Test Page"],
        "content_type":["application/pdf"],
        "author":"Yukon Canada Yukon Department of Education",
        "author_s":"Yukon Canada Yukon Department of Education",
        "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  PDF Test Page \n \n    \n  \n \nPDF Test File \n \nCongratulations, your computer is equipped with a PDF (Portable Document Format) \nreader!  You should be able to view any of the PDF documents and forms available on \nour site.  PDF forms are indicated by these icons:   or  .   \n \nYukon Department of Education \nBox 2703 \nWhitehorse,Yukon \nCanada \nY1A 2C6 \n \nPlease visit our website at:  http://www.education.gov.yk.ca/\n    \n  \n    \n \n  "],
        "_version_":1533049305513852928}]
  }}

1 Answer:

Answer 0: (score: 0)

Check which field type schema.xml defines for the content field.

Compare the field type of the content field with that of the title field.
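One way to make that comparison without opening the schema file is Solr's Schema API. The sketch below assumes the core is reachable at the URL used in the question; the helper names (`field_url`, `fetch_field`, `diff_fields`) are made up for illustration, and `fetch_field` needs a running Solr instance.

```python
import json
from urllib.request import urlopen

# Assumption: Solr runs locally with the "techproducts" core, as in the question.
SOLR_CORE_URL = "http://localhost:8983/solr/techproducts"


def field_url(field_name, base=SOLR_CORE_URL):
    """Build the Schema API URL that describes a single field."""
    return f"{base}/schema/fields/{field_name}?wt=json"


def fetch_field(field_name, base=SOLR_CORE_URL):
    """Fetch one field definition (requires a running Solr instance)."""
    with urlopen(field_url(field_name, base)) as resp:
        return json.load(resp)["field"]


def diff_fields(a, b):
    """Return {property: (value_in_a, value_in_b)} for properties that differ."""
    keys = (set(a) | set(b)) - {"name"}
    return {k: (a.get(k), b.get(k)) for k in sorted(keys) if a.get(k) != b.get(k)}


# Usage against a live core:
#   print(diff_fields(fetch_field("content"), fetch_field("title")))
```

Any property that shows up in the diff (for example `type` or `indexed`) is a candidate explanation for why one field is searchable and the other is not.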

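It may be that the content field does not have a suitable field type. Some field types produce no separate tokens for your text and instead treat the entire value as a single token. This happens if the field uses the string field type or a field type built on KeywordTokenizer.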
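To see why a single-token field cannot match `content:PDF`, here is a deliberately simplified simulation in plain Python. It is not Solr's analysis chain: `keyword_tokens` and `word_tokens` are hypothetical stand-ins for a keyword-style analyzer and a lowercasing word analyzer.

```python
import re


def keyword_tokens(text):
    """Mimic a keyword-style analyzer: the whole value is one token."""
    return [text]


def word_tokens(text):
    """Mimic a simple word analyzer: split on non-alphanumerics, lowercase."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def matches(query_term, indexed_tokens, lowercase_query):
    """A term query matches only if it equals one indexed token exactly."""
    term = query_term.lower() if lowercase_query else query_term
    return term in indexed_tokens


content = " PDF Test Page \n\nPDF Test File \n"
print(matches("PDF", keyword_tokens(content), lowercase_query=False))  # False
print(matches("PDF", word_tokens(content), lowercase_query=True))      # True
```

With the keyword-style analyzer the only indexed token is the full text, so the term `PDF` never equals it; with word tokens the term is found.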

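You can verify this on the Analysis screen of the Solr admin UI.

It shows how the text is analyzed at index time and how the search text is analyzed at query time.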
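The same analysis can be requested over HTTP, which is handy for scripting. This sketch assumes the field analysis handler is available at its usual `/analysis/field` path on the core from the question; `analysis_url` and `fetch_analysis` are illustrative names, and `fetch_analysis` needs a running Solr instance.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Solr runs locally with the "techproducts" core, as in the question.
SOLR_CORE_URL = "http://localhost:8983/solr/techproducts"


def analysis_url(field_name, field_value, query, base=SOLR_CORE_URL):
    """Build a field-analysis request for an index value and a query string."""
    params = {
        "analysis.fieldname": field_name,
        "analysis.fieldvalue": field_value,
        "analysis.query": query,
        "wt": "json",
    }
    return f"{base}/analysis/field?{urlencode(params)}"


def fetch_analysis(field_name, field_value, query, base=SOLR_CORE_URL):
    """Run the analysis request (requires a running Solr instance)."""
    with urlopen(analysis_url(field_name, field_value, query, base)) as resp:
        return json.load(resp)


# Usage against a live core:
#   result = fetch_analysis("content", "PDF Test Page", "PDF")
#   print(json.dumps(result["analysis"], indent=2))
```

If the index-side output for `content` is a single token holding the whole text, or the query-side token differs from every index-side token, the filter query cannot match.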

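To search on a field, it must be declared with indexed=true; if you also want Solr to return its value, add stored=true.

Together, these two attributes let you search the field and retrieve its original value.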
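If the schema is managed through the Schema API rather than a hand-edited schema.xml, the field can be redefined with a `replace-field` command. The sketch below only illustrates the shape of the request: the `text_general` type and `multiValued` setting are assumptions about what the field should look like, and the helper names are made up. Documents indexed before the change must be reindexed for it to take effect.

```python
import json
from urllib.request import Request, urlopen

# Assumption: Solr runs locally with the "techproducts" core, as in the question.
SOLR_CORE_URL = "http://localhost:8983/solr/techproducts"


def replace_field_command(name, field_type, multi_valued=True):
    """Build a Schema API command that makes a field searchable and returnable."""
    return {
        "replace-field": {
            "name": name,
            "type": field_type,        # assumed type, e.g. "text_general"
            "indexed": True,           # required to search on the field
            "stored": True,            # required to return the original value
            "multiValued": multi_valued,
        }
    }


def post_schema_command(command, base=SOLR_CORE_URL):
    """POST the command to the Schema API (requires a running Solr instance)."""
    req = Request(
        f"{base}/schema",
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.load(resp)


# Usage against a live core:
#   post_schema_command(replace_field_command("content", "text_general"))
```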