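I'm new to Elasticsearch. Every example, tutorial, and question I can find seems to suggest that Elasticsearch requires documents to be formatted as field-value pairs. Is it possible to search free-form text documents or logs that have no particular key-value structure? If so, could someone point me to an example or documentation?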
Answer 0 (score: 1)
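Yes, this is entirely possible: create documents that contain a single field holding all of the free-form text. It works very well, because Elasticsearch is well suited to text search. You can index the whole document as is, for example: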
{ "text": "An alpaca (Vicugna pacos) is a domesticated species of South American camelid. It resembles a small llama in appearance. There are two breeds of alpaca; the Suri alpaca and the Huacaya alpaca. Alpacas are kept in herds that graze on the level heights of the Andes of southern Peru, northern Bolivia, Ecuador, and northern Chile at an altitude of 3,500 m (11,500 ft) to 5,000 m (16,000 ft) above sea level, throughout the year.[1] Alpacas are considerably smaller than llamas, and unlike llamas, they were not bred to be beasts of burden, but were bred specifically for their fiber. Alpaca fiber is used for making knitted and woven items, similar to wool. These items include blankets, sweaters, hats, gloves, scarves, a wide variety of textiles and ponchos in South America, and sweaters, socks, coats and bedding in other parts of the world. The fiber comes in more than 52 natural colors as classified in Peru, 12 as classified in Australia and 16 as classified in the United States.[2]" }
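If you do this and you know that all of your text is in English (or in some other single language), you should give Elasticsearch a mapping that names the matching analyzer. The requests below create the mapping, index a document, and search for it. They assume Elasticsearch 7 or later (typeless mappings and the "text" field type):
PUT /freetext
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text",
        "analyzer": "english"
      }
    }
  }
}

PUT /freetext/_doc/alpaga
{
  "text": "alpaga are awesome"
}

GET /freetext/_search?q=alpaga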
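When there are many free-form texts to index rather than one, the same single-field idea can be applied in bulk. The Python sketch below wraps each text in a one-field document and assembles a body for the `_bulk` endpoint. The helper name `to_bulk_body` and the sample log lines are assumptions made for this example, and the action line uses the typeless form accepted by Elasticsearch 7 and later.

```python
import json


def to_bulk_body(texts, index="freetext"):
    """Build an NDJSON body for the _bulk endpoint.

    Each free-form text becomes one action line followed by one
    single-field document of the form {"text": ...}.
    """
    lines = []
    for text in texts:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps({"text": text}))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical free-form log lines with no key-value structure.
logs = [
    "2015-03-02 10:15:01 server started",
    "something went wrong while reading /tmp/data",
]
body = to_bulk_body(logs)
print(body)
```

The resulting string can be sent as the body of `POST /_bulk` with the `Content-Type: application/x-ndjson` header.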
You may also be interested in more specialized analyzers; I think this one is good:
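Before that, though, think about your data: I'm confident you can always find additional fields. For example, you may want to index logs with specific fields (date, IP, application name, message level (error, warning, info)). Most text documents also have an author and a date.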
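As an illustration of pulling such fields out of a log line before indexing, here is a minimal Python sketch. The log layout (ISO timestamp, IPv4 address, application name, level, message), the field names, and the helper `parse_log_line` are all assumptions for this example; real logs will need their own pattern.

```python
import re

# Hypothetical layout: "<ISO date> <IPv4> <app> <LEVEL> <free text>"
LOG_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)\s+"
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<app>\S+)\s+"
    r"(?P<level>ERROR|WARNING|INFO)\s+"
    r"(?P<text>.*)$"
)


def parse_log_line(line):
    """Split a log line into structured fields plus the remaining free text."""
    match = LOG_PATTERN.match(line)
    if match is None:
        # Unknown layout: keep the whole line as free text.
        return {"text": line}
    return match.groupdict()


doc = parse_log_line(
    "2015-03-02T10:15:01Z 10.0.0.7 billing ERROR could not reach the database"
)
print(doc)
```

Lines that do not match the pattern still produce a valid single-field document, so nothing is lost when the layout varies.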
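By separating out this special information and telling Elasticsearch what each field is (dates, and text that should not be analyzed), you will be able to run searches such as: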
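One search of this kind can be sketched in Python as follows. The field names (`date`, `ip`, `app`, `level`, `text`) and the helper `build_search` are hypothetical, and the `keyword` field type assumes Elasticsearch 5 or later. The mapping marks the structured fields as non-analyzed, and the query combines a full-text match on the free text with exact filters on the other fields.

```python
import json

# Hypothetical mapping: structured fields are stored as exact values,
# only the free text goes through the English analyzer.
mapping = {
    "mappings": {
        "properties": {
            "date": {"type": "date"},
            "ip": {"type": "ip"},
            "app": {"type": "keyword"},
            "level": {"type": "keyword"},
            "text": {"type": "text", "analyzer": "english"},
        }
    }
}


def build_search(words, level, since):
    """Full-text match on `text`, filtered by exact level and a start date."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"text": words}}],
                "filter": [
                    {"term": {"level": level}},
                    {"range": {"date": {"gte": since}}},
                ],
            }
        }
    }


query = build_search("database", "ERROR", "2015-03-01")
print(json.dumps(query, indent=2))
```

The printed JSON can be used as the body of a `_search` request against an index created with the mapping above.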