ElasticSearch & attachment type (NEST C#)

Time: 2013-02-08 21:55:06

Tags: elasticsearch attachment nest

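I am trying to index PDF documents using Elasticsearch / NEST.

The file gets indexed, but the search returns 0 hits.

I need the search results to return only the document ID and the highlight result

(without the base64 content).

Here is the code:

I would appreciate any help,

Thanks,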

class Program
{
    static void Main(string[] args)
    {
        // create es client
        string index = "myindex";

        var settings = new ConnectionSettings("localhost", 9200)
            .SetDefaultIndex(index);
        var es = new ElasticClient(settings);

        // delete index if any
        es.DeleteIndex(index);

        // index document
        string path = "test.pdf";
        var doc = new Document()
        {
            Id = 1,
            Title = "test",
            Content = Convert.ToBase64String(File.ReadAllBytes(path))
        };

        var parameters = new IndexParameters() { Refresh = true };
        if (es.Index<Document>(doc, parameters).OK)
        {
            // search in document
            string query = "semantic"; // test.pdf contains the string "semantic"

            var result = es.Search<Document>(s => s
                .Query(q =>
                    q.QueryString(qs => qs
                        .Query(query)
                    )
                )
                .Highlight(h => h
                    .PreTags("<b>")
                    .PostTags("</b>")
                    .OnFields(
                      f => f
                        .OnField(e => e.Content)
                        .PreTags("<em>")
                        .PostTags("</em>")
                    )
                )
            );

            if (result.Hits.Total == 0)
            {
            }
        }
    }
}

[ElasticType(
    Name = "document",
    SearchAnalyzer = "standard",
    IndexAnalyzer = "standard"
)]
public class Document
{
    public int Id { get; set; }

    [ElasticProperty(Store = true)]
    public string Title { get; set; }

    [ElasticProperty(Type = FieldType.attachment,
        TermVector = TermVectorOption.with_positions_offsets)]
    public string Content { get; set; }
}

4 answers:

Answer 0 (score: 8)

Install the attachment plugin and restart ES

bin/plugin -install elasticsearch/elasticsearch-mapper-attachments/2.3.2

Create an Attachment class that maps to the document format expected by the attachment plugin

  public class Attachment
  {
      [ElasticProperty(Name = "_content")]
      public string Content { get; set; }

      [ElasticProperty(Name = "_content_type")]
      public string ContentType { get; set; }

      [ElasticProperty(Name = "_name")]
      public string Name { get; set; }
  }

On the Document class you want to index, add a property named "File" and map it correctly

  [ElasticProperty(Type = FieldType.Attachment, TermVector = TermVectorOption.WithPositionsOffsets, Store = true)]
  public Attachment File { get; set; }

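Create the index explicitly before indexing any instance of your class. If you don't, Elasticsearch falls back to dynamic mapping and ignores your attribute mapping. If you change the mapping later, always recreate the index.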

  client.CreateIndex("index-name", c => c
     .AddMapping<Document>(m => m.MapFromAttributes())
  );

Index your item

  string path = "test.pdf";

  var attachment = new Attachment();
  attachment.Content = Convert.ToBase64String(File.ReadAllBytes(path));
  attachment.ContentType = "application/pdf";
  attachment.Name = "test.pdf";

  var doc = new Document()
  {
      Id = 1,
      Title = "test",
      File = attachment
  };
  client.Index<Document>(doc); // index the document built above
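
To make the shape of the indexed document concrete, here is a minimal sketch of the JSON source that an Attachment mapped as above would serialize to. It does not use NEST; it builds the body with System.Text.Json (a newer library than this thread) purely for illustration. The class and method names are hypothetical, and the field name "file" assumes the File property is serialized in lower case.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

public static class AttachmentPayloadSketch
{
    // Builds the JSON source for one document whose "file" field carries
    // the _content / _content_type / _name keys used by the Attachment class.
    public static string BuildSource(int id, string title, string fileName,
                                     string contentType, byte[] bytes)
    {
        var source = new Dictionary<string, object>
        {
            ["id"] = id,
            ["title"] = title,
            ["file"] = new Dictionary<string, object>
            {
                ["_content_type"] = contentType,
                ["_name"] = fileName,
                // The attachment content is sent base64-encoded.
                ["_content"] = Convert.ToBase64String(bytes)
            }
        };
        return JsonSerializer.Serialize(source);
    }

    public static void Main()
    {
        // Stand-in bytes; in the answer these come from File.ReadAllBytes("test.pdf").
        byte[] bytes = Encoding.UTF8.GetBytes("stand-in for PDF bytes");
        Console.WriteLine(BuildSource(1, "test", "test.pdf", "application/pdf", bytes));
    }
}
```

Comparing this shape with what actually gets sent can help explain a search that returns 0 hits: in the question, the base64 string is sent as a plain string property named "content".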

Search on the File property

  var query = Query<Document>.Term("file", "searchTerm");

  var searchResults = client.Search<Document>(s => s
          .From(start)
          .Size(count)
          .Query(query)
  );
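
The question also asked for hits that carry only the document ID and the highlights, without the base64 content. As a rough sketch, the raw search body for that could look like the JSON built below. It assumes an Elasticsearch version of this era, where an empty "fields" array makes each hit return only its _id and _type, and it assumes the attachment field is named "file". The class and method names are hypothetical, and this is plain System.Text.Json rather than the NEST API.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class SearchBodySketch
{
    // Builds a search body that asks for no stored fields and for
    // highlights on the attachment field.
    public static string Build(string queryText, string attachmentField, int from, int size)
    {
        var body = new Dictionary<string, object>
        {
            ["from"] = from,
            ["size"] = size,
            // Assumption: an empty list means hits carry only _id and _type.
            ["fields"] = new string[0],
            ["query"] = new Dictionary<string, object>
            {
                ["query_string"] = new Dictionary<string, object> { ["query"] = queryText }
            },
            ["highlight"] = new Dictionary<string, object>
            {
                ["fields"] = new Dictionary<string, object>
                {
                    // Empty options: the default highlight tags (<em>, </em>) apply.
                    [attachmentField] = new Dictionary<string, object>()
                }
            }
        };
        return JsonSerializer.Serialize(body);
    }

    public static void Main()
    {
        Console.WriteLine(Build("semantic", "file", 0, 10));
    }
}
```

Highlighting the attachment content relies on the field being stored with term vectors, which is what the Store and TermVector settings on the File property are for. Note also that a term query is not analyzed, so with the standard analyzer the search term has to be given in lower case.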

Answer 1 (score: 1)

// I am using the FSRiver plugin - https://github.com/dadoonet/fsriver/

void Main()
{
    // search in document
    string query = "directly"; // test.pdf contains the string "directly"
    var es = new ElasticClient(new ConnectionSettings(new Uri("http://*.*.*.*:9200"))
        .SetDefaultIndex("mydocs")
        .MapDefaultTypeNames(s => s.Add(typeof(Doc), "doc")));
    var result = es.Search<Doc>(s => s
        .Fields(f => f.Title, f => f.Name)
        .From(0)
        .Size(10000)
        .Query(q => q.QueryString(qs => qs.Query(query)))
        .Highlight(h => h
            .PreTags("<b>")
            .PostTags("</b>")
            .OnFields(
                f => f
                    .OnField(e => e.File)
                    .PreTags("<em>")
                    .PostTags("</em>")
            )
        )
    );
}

[ElasticType(Name = "doc",  SearchAnalyzer = "standard", IndexAnalyzer = "standard")]
public class Doc
{
    public int Id { get; set; }

    [ElasticProperty(Store = true)]
    public string Title { get; set; }

    [ElasticProperty(Type = FieldType.attachment, TermVector = TermVectorOption.with_positions_offsets)]
    public string File { get; set; }

    public string Name { get; set; }
}

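Answer 2 (score: 0)

I'm still working on this, so for now I'm trying this: http://www.elasticsearch.cn/tutorials/2011/07/18/attachment-type-in-action.html

That article explains the problem.

Note that you need to get the mapping right: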

 "title" : { "store" : "yes" },
 "file" : { "term_vector":"with_positions_offsets", "store":"yes" }

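For context, here is a sketch of where those two lines might sit in a complete mapping. It assumes they belong under the "fields" section of an attachment-typed property named "file", which is how the attachment plugin's mapping is usually laid out, and it reuses the type name "document" from the question. The class and method names are hypothetical, and the JSON is built with System.Text.Json rather than NEST.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class AttachmentMappingSketch
{
    // Builds a type mapping in which "file" is an attachment whose extracted
    // text is stored with term vectors, so it can be highlighted.
    public static string Build(string typeName)
    {
        var fields = new Dictionary<string, object>
        {
            ["title"] = new Dictionary<string, object> { ["store"] = "yes" },
            ["file"] = new Dictionary<string, object>
            {
                ["term_vector"] = "with_positions_offsets",
                ["store"] = "yes"
            }
        };
        var fileProperty = new Dictionary<string, object>
        {
            ["type"] = "attachment",
            ["fields"] = fields
        };
        var mapping = new Dictionary<string, object>
        {
            [typeName] = new Dictionary<string, object>
            {
                ["properties"] = new Dictionary<string, object> { ["file"] = fileProperty }
            }
        };
        return JsonSerializer.Serialize(mapping);
    }

    public static void Main()
    {
        Console.WriteLine(Build("document"));
    }
}
```

The resulting JSON would be supplied as the type mapping when the index is created, before any document is indexed.
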
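I will try to figure out how to do this with the NEST API and update this post.

Answer 3 (score: -1)

Before indexing any items, you need to add a mapping like the one below.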

client.CreateIndex("yourindex", c => c.NumberOfReplicas(0).NumberOfShards(12).AddMapping<AssetSearchEntryModels>(m => m.MapFromAttributes()));