Elasticsearch - Trying to index MS Word attachments & run full-text search within them

Date: 2016-12-12 11:40:33

Tags: elasticsearch elasticsearch-plugin

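As the title says, I am trying to index MS Word documents and run full-text searches against their content.

I have looked at several examples, but I cannot figure out what I am doing wrong.

Relevant code: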

[ElasticsearchType(Name = "AttachmentDocuments")]
public class Attachment
{
    [String(Name = "_content")]
    public string Content { get; set; }
    [String(Name = "_content_type")]
    public string ContentType { get; set; }
    [String(Name = "_name")]
    public string Name { get; set; }

    public Attachment(Task<File> file)
    {
        Content = file.Result.FileContent;
        ContentType = file.Result.FileType;
        Name = file.Result.FileName;
    }
}

The "Content" property above is set to "file.Result.FileContent" in the constructor. The "Content" property is a base64 string.
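For readers unsure how such a base64 string is usually produced, here is a minimal sketch using only the .NET base class library. The class name `Base64FileEncoder`, the method `EncodeFile`, and the sample file name are hypothetical and are not part of the question's code; how `FileContent` is actually populated is not shown in the question.

```csharp
using System;
using System.IO;

public static class Base64FileEncoder
{
    // Reads the whole file into memory and returns its raw bytes as a base64 string.
    // (Hypothetical helper; the question does not show how FileContent is built.)
    public static string EncodeFile(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        return Convert.ToBase64String(bytes);
    }

    public static void Main(string[] args)
    {
        // Write a tiny sample file so the sketch runs without any external input.
        string path = Path.Combine(Path.GetTempPath(), "sample.docx");
        File.WriteAllBytes(path, new byte[] { 72, 105 }); // the ASCII bytes for "Hi"

        Console.WriteLine(EncodeFile(path));
    }
}
```

The point is that the raw bytes of the .docx file are encoded, not text extracted from it; text extraction is left to the attachment plugin on the Elasticsearch side.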

public class Document
{
    [Number(Name = "Id")]
    public int Id { get; set; }
    [Attachment]
    public Attachment File { get; set; }
    public String Title { get; set; }
}

Below is the method that indexes the document into Elasticsearch.

    public void IndexDocument(Attachment attachmentDocument)
    {
        // Create the index if it does not already exist
        var indexExists = _client.IndexExists(new IndexExistsRequest(ElasticsearchIndexName));
        if (!indexExists.Exists)
        {
            var indexDescriptor =
                new CreateIndexDescriptor(new IndexName {Name = ElasticsearchIndexName}).Mappings(
                    ms => ms.Map<Document>(m => m.AutoMap()));
            _client.CreateIndex(indexDescriptor);
        }

        var doc = new Document()
        {
            Id = 1,
            Title = "Test",
            File = attachmentDocument
        };

        _client.Index(doc);
    }

With the code above, the document is indexed into the correct index (screenshot from the Elasticsearch host, Searchly):

Searchly Screenshot

The content of the file is "VCXCVXCVXCVXCVXVXCVXCV", and the following query returns zero hits:

        QueryContainer queryContainer = null;
        queryContainer |= new MatchQuery()
        {
            Field = "file",
            Query = "VCXCVXCVXCVXCVXVXCVXCV"
        };

        var searchResult =
            await _client.LowLevel.SearchAsync<string>(ApplicationsIndexName, "document", new SearchRequest()
            {
                From = 0,
                Size = 10,
                Query = queryContainer, 
                Aggregations = GetAggregations()
            });

If anyone could give me a hint about what I am doing wrong or what I should look into, I would appreciate it.

Here is a screenshot of the mapping in my Elasticsearch instance:

Elasticsearch - Mapping

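1 Answer:

Answer 0: (score: 1)

Because you are querying the wrong field. With an attachment mapping, the text extracted from the document is indexed under the content sub-field, so the field should be file.content rather than file: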

        queryContainer |= new MatchQuery()
        {
            Field = "file.content",
            Query = "VCXCVXCVXCVXCVXVXCVXCV"
        };
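
To show the corrected field in context, here is a rough sketch of a complete search using the high-level NEST client instead of the low-level call from the question. It assumes NEST 2.x and an index whose "file" field is mapped as an attachment; the node URL, the index name "documents", and the trimmed-down `Document` class are placeholders, not values taken from the question.

```csharp
using System;
using Nest;

// Stand-in for the question's Document type; only the members needed here.
public class Document
{
    public int Id { get; set; }
    public string Title { get; set; }
}

public static class AttachmentSearchSketch
{
    public static void Main()
    {
        // Assumed node URL and index name; replace with your own.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
            .DefaultIndex("documents");
        var client = new ElasticClient(settings);

        var request = new SearchRequest<Document>
        {
            From = 0,
            Size = 10,
            // Target the extracted text, not the parent "file" field.
            Query = new MatchQuery
            {
                Field = "file.content",
                Query = "VCXCVXCVXCVXCVXVXCVXCV"
            }
        };

        var response = client.Search<Document>(request);
        Console.WriteLine("Total hits: " + response.Total);
    }
}
```

If this still returns no hits, it may be worth checking that the search runs against the same index the document was written to, since the question's indexing and search snippets use different index name constants.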