ElasticSearch file mapping for searching docs indexed by fscrawler with NEST in C#

Asked: 2017-04-18 17:26:54

Tags: c# wpf elasticsearch nest

I am using fscrawler 2.3-SNAPSHOT to index the documents in the folder "/tmp/es". It maps them as:

{
  "properties" : {
    "attachment" : {
      "type" : "binary",
      "doc_values": false
    },
    "attributes" : {
      "properties" : {
        "group" : {
          "type" : "keyword"
        },
        "owner" : {
          "type" : "keyword"
        }
      }
    },
    "content" : {
      "type" : "text"
    },
    "file" : {
      "properties" : {
        "content_type" : {
          "type" : "keyword"
        },
        "filename" : {
          "type" : "keyword"
        },
        "extension" : {
          "type" : "keyword"
        },
        "filesize" : {
          "type" : "long"
        },
        "indexed_chars" : {
          "type" : "long"
        },
        "indexing_date" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "last_modified" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "checksum": {
          "type": "keyword"
        },
        "url" : {
          "type" : "keyword",
          "index" : true
        }
      }
    },
    "object" : {
      "type" : "object"
    },
    "meta" : {
      "properties" : {
        "author" : {
          "type" : "text"
        },
        "date" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "keywords" : {
          "type" : "text"
        },
        "title" : {
          "type" : "text"
        },
        "language" : {
          "type" : "keyword"
        }
      }
    },
    "path" : {
      "properties" : {
        "encoded" : {
          "type" : "keyword"
        },
        "real" : {
          "type" : "keyword",
          "fields": {
            "tree": {
              "type" : "text",
              "analyzer": "fscrawler_path",
              "fielddata": true
            }
          }
        },
        "root" : {
          "type" : "keyword"
        },
        "virtual" : {
          "type" : "keyword",
          "fields": {
            "tree": {
              "type" : "text",
              "analyzer": "fscrawler_path",
              "fielddata": true
            }
          }
        }
      }
    }
  }
}

Now I want to search them with NEST in my C# application. I can get the content through hit.Source.content, but I cannot get the file name through hit.Source.filename.

Code:

var response = elasticClient.Search<documents>(s => s
    .Index("tanks")
    .Type("doc")
    .Query(q => q.QueryString(qs => qs.Query(query))));

if (rtxSearchResult.Text != " ")
{
    rtxSearchResult.Text = " ";

    foreach (var hit in response.Hits)
    {
        // filename and url come back null here, so ToString() throws
        rtxSearchResult.Text = rtxSearchResult.Text + ("Name: " + hit.Source.filename.ToString()
            + Environment.NewLine
            + "Content: " + hit.Source.content.ToString()
            + Environment.NewLine
            + "URL: " + hit.Source.url.ToString()
            + Environment.NewLine
            + Environment.NewLine);
    }
}

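The code above throws a NullReferenceException, but it runs when I comment out hit.Source.url and hit.Source.filename.

Kibana shows the file name field as file.filename, the URL as file.url, and the content as content.

Since filename is nested under file, I am unable to retrieve it. Please help, I have been stuck on this for a couple of days.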

1 Answer:

Answer 0: (score: 0)

Found the mistake:

My documents class was:

class documents
{
    public string filename { get; set; }

    public string content { get; set; }

    public string url { get; set; }
}

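Since the file name and URL are stored as file.filename and file.url respectively, the flat filename and url properties never get populated and stay null. We need a second class, File, that holds filename and url, and a file property of that type on documents: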

class documents
{
    public File file { get; set; }

    public string content { get; set; }
}

class File
{
    public string filename { get; set; }

    public string url { get; set; }
}

With that in place, I can access them through hit.Source.file.filename and hit.Source.file.url.
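
The difference between the two class shapes can be reproduced without a cluster. The following is only a rough sketch under a few assumptions: the JSON string is a made-up _source in the shape fscrawler produces (not real data from the "tanks" index), System.Text.Json stands in for the serializer NEST uses internally, and the names FlatDocument, NestedDocument, FilePart and Describe are hypothetical, chosen so both shapes fit in one program. The null-conditional operator (?.) is added so that a document with no file object does not throw.

```csharp
using System;
using System.Text;
using System.Text.Json;

// Flat shape from the question: filename and url are expected at the top level.
class FlatDocument
{
    public string filename { get; set; }
    public string content { get; set; }
    public string url { get; set; }
}

// Nested shape from the answer: filename and url live under "file".
class FilePart
{
    public string filename { get; set; }
    public string url { get; set; }
}

class NestedDocument
{
    public FilePart file { get; set; }
    public string content { get; set; }
}

static class Program
{
    // Builds the same three-line summary as the question's loop,
    // falling back to a placeholder instead of throwing on null.
    static string Describe(NestedDocument doc)
    {
        var sb = new StringBuilder();
        sb.AppendLine("Name: " + (doc.file?.filename ?? "(none)"));
        sb.AppendLine("Content: " + (doc.content ?? "(none)"));
        sb.AppendLine("URL: " + (doc.file?.url ?? "(none)"));
        return sb.ToString();
    }

    static void Main()
    {
        // Hypothetical _source, invented for illustration only.
        const string source =
            "{\"content\":\"hello\",\"file\":{\"filename\":\"a.txt\",\"url\":\"file:///tmp/es/a.txt\"}}";

        var flat = JsonSerializer.Deserialize<FlatDocument>(source);
        var nested = JsonSerializer.Deserialize<NestedDocument>(source);

        // No top-level "filename" key exists, so the flat property stays null.
        Console.WriteLine(flat.filename == null);

        // The nested shape mirrors the JSON, so the values are populated.
        Console.Write(Describe(nested));
    }
}
```

In the original loop, the same idea would mean reading hit.Source.file?.filename instead of calling ToString() on a property that may be null.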