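I am trying to index PDF documents using Elasticsearch / NEST.
The file gets indexed, but the search returns 0 hits.
I need the search results to return only the document ID and the highlighted fragments
(without the base64 content).
I would appreciate any help.
Thanks,
Here is the code: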
class Program
{
    static void Main(string[] args)
    {
        // create es client
        string index = "myindex";
        var settings = new ConnectionSettings("localhost", 9200)
            .SetDefaultIndex(index);
        var es = new ElasticClient(settings);
        // delete index if any
        es.DeleteIndex(index);
        // index document
        string path = "test.pdf";
        var doc = new Document()
        {
            Id = 1,
            Title = "test",
            Content = Convert.ToBase64String(File.ReadAllBytes(path))
        };
        var parameters = new IndexParameters() { Refresh = true };
        if (es.Index<Document>(doc, parameters).OK)
        {
            // search in document
            string query = "semantic"; // test.pdf contains the string "semantic"
            var result = es.Search<Document>(s => s
                .Query(q =>
                    q.QueryString(qs => qs
                        .Query(query)
                    )
                )
                .Highlight(h => h
                    .PreTags("<b>")
                    .PostTags("</b>")
                    .OnFields(
                        f => f
                            .OnField(e => e.Content)
                            .PreTags("<em>")
                            .PostTags("</em>")
                    )
                )
            );
            if (result.Hits.Total == 0)
            {
            }
        }
    }
}
[ElasticType(
    Name = "document",
    SearchAnalyzer = "standard",
    IndexAnalyzer = "standard"
)]
public class Document
{
    public int Id { get; set; }
    [ElasticProperty(Store = true)]
    public string Title { get; set; }
    [ElasticProperty(Type = FieldType.attachment,
        TermVector = TermVectorOption.with_positions_offsets)]
    public string Content { get; set; }
}
Answer 0 (score: 8)
Install the attachment plugin and restart ES:
bin/plugin -install elasticsearch/elasticsearch-mapper-attachments/2.3.2
Create an Attachment class that maps to the document format the attachment plugin expects:
public class Attachment
{
    [ElasticProperty(Name = "_content")]
    public string Content { get; set; }
    [ElasticProperty(Name = "_content_type")]
    public string ContentType { get; set; }
    [ElasticProperty(Name = "_name")]
    public string Name { get; set; }
}
Add a property named "File" to the Document class you want to index, and map it correctly:
[ElasticProperty(Type = FieldType.Attachment, TermVector = TermVectorOption.WithPositionsOffsets, Store = true)]
public Attachment File { get; set; }
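Create the index explicitly before indexing any instance of your class. If you don't, Elasticsearch will use dynamic mapping and ignore your attribute mappings. If you change the mapping later, always recreate the index.
client.CreateIndex("index-name", c => c
    .AddMapping<Document>(m => m.MapFromAttributes())
);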
Index your document:
string path = "test.pdf";
var attachment = new Attachment();
// the attachment plugin expects the file content as a base64 string
attachment.Content = Convert.ToBase64String(File.ReadAllBytes(path));
attachment.ContentType = "application/pdf";
attachment.Name = "test.pdf";
var doc = new Document()
{
    Id = 1,
    Title = "test",
    File = attachment
};
// index the document built above
client.Index<Document>(doc);
Search on the File property:
// start and count are paging values supplied by the caller
var query = Query<Document>.Term("file", "searchTerm");
var searchResults = client.Search<Document>(s => s
    .From(start)
    .Size(count)
    .Query(query)
);
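The question also asks for hits that carry only the document ID and the highlights, not the base64 content. At the REST level this could look roughly like the request body below. It is only a sketch, assuming an Elasticsearch 1.x cluster, an attachment field named "file" that is stored with term vectors as mapped above, and the query text "semantic" from the question; an empty "fields" list is intended to keep stored fields and _source out of each hit.

```json
{
  "fields": [],
  "query": {
    "query_string": { "query": "semantic" }
  },
  "highlight": {
    "pre_tags": ["<em>"],
    "post_tags": ["</em>"],
    "fields": {
      "file": {}
    }
  }
}
```

If highlights come back empty, it is worth checking the live mapping first, since a dynamically mapped string field will not have the extracted PDF text.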
Answer 1 (score: 1)
// I am using the FSRiver plugin - https://github.com/dadoonet/fsriver/
void Main()
{
    // search in document
    string query = "directly"; // test.pdf contains the string "directly"
    var es = new ElasticClient(new ConnectionSettings(new Uri("http://*.*.*.*:9200"))
        .SetDefaultIndex("mydocs")
        .MapDefaultTypeNames(s => s.Add(typeof(Doc), "doc")));
    var result = es.Search<Doc>(s => s
        .Fields(f => f.Title, f => f.Name)
        .From(0)
        .Size(10000)
        .Query(q => q.QueryString(qs => qs.Query(query)))
        .Highlight(h => h
            .PreTags("<b>")
            .PostTags("</b>")
            .OnFields(
                f => f
                    .OnField(e => e.File)
                    .PreTags("<em>")
                    .PostTags("</em>")
            )
        )
    );
}
[ElasticType(Name = "doc", SearchAnalyzer = "standard", IndexAnalyzer = "standard")]
public class Doc
{
    public int Id { get; set; }
    [ElasticProperty(Store = true)]
    public string Title { get; set; }
    [ElasticProperty(Type = FieldType.attachment, TermVector = TermVectorOption.with_positions_offsets)]
    public string File { get; set; }
    public string Name { get; set; }
}
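Answer 2 (score: 0)
I am working on this as well, so for now I am trying this: http://www.elasticsearch.cn/tutorials/2011/07/18/attachment-type-in-action.html
The article explains the problem.
Note that you need to set up the mapping correctly: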
"title" : { "store" : "yes" },
"file" : { "term_vector":"with_positions_offsets", "store":"yes" }
I will try to figure out how to do this with the NEST API and update this post.
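For context, the two mapping fragments in this answer are sub-field settings of an attachment field. A fuller mapping might look roughly like the sketch below. The type name "document" and the outer field name "file" are assumptions borrowed from the question and the other answers, and the name of the inner content sub-field ("file" here) depends on the plugin version, so check it against the README of the version you installed.

```json
{
  "document": {
    "properties": {
      "file": {
        "type": "attachment",
        "fields": {
          "title": { "store": "yes" },
          "file": { "term_vector": "with_positions_offsets", "store": "yes" }
        }
      }
    }
  }
}
```

Storing the content sub-field with positions and offsets is what lets the highlighter return fragments from the extracted text.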
Answer 3 (score: -1)
Before indexing any items, you need to add the mapping, as shown below.
client.CreateIndex("yourindex", c => c
    .NumberOfReplicas(0)
    .NumberOfShards(12)
    .AddMapping<AssetSearchEntryModels>(m => m.MapFromAttributes())
);