My requirement is to store only specific document fields in Elasticsearch for indexing. For example, my document is:
{
"name":"stev",
"age":26,
"salary":25000
}
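This is my document, but I don't want to index the whole document. I want to store only the name field. I created an index emp and wrote the mapping as follows: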
"person" : {
"_all" : {"enabled" : false},
"properties" : {
"name" : {
"type" : "string", "store" : "yes"
}
}
}
When I look at the indexed documents, I get:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "test",
"_type": "test",
"_id": "AU1_p0xAq8r9iH00jFB_",
"_score": 1,
"_source": { }
}
,
{
"_index": "test",
"_type": "test",
"_id": "AU1_lMDCq8r9iH00jFB-",
"_score": 1,
"_source": { }
}
]
}
}
The name field does not appear. Why? Can anyone help me?
Answer 0 (score: 1)
It's hard to tell what is wrong with what you posted, but I can give an example that works.
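By default, Elasticsearch indexes any source document you give it. Whenever it sees a new document field, it creates a mapping field with sensible defaults, and it indexes that field by default as well. If you want to exclude fields, you can set "index": "no" and "store": "no" in the mapping for each field you want to exclude. If you want that behavior to be the default for every field, you can use the "_default_" property to specify that fields are not stored (although I could not get it to make them not indexed).
You will probably also need to disable "_source" and use the "fields" parameter in your search queries.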
Here is an example. The index definition looks like this:
PUT /test_index
{
"mappings": {
"person": {
"_all": {
"enabled": false
},
"_source": {
"enabled": false
},
"properties": {
"name": {
"type": "string",
"index": "analyzed",
"store": "yes"
},
"age": {
"type": "integer",
"index": "no",
"store": "no"
},
"salary": {
"type": "integer",
"index": "no",
"store": "no"
}
}
}
}
}
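Writing the same "index": "no" / "store": "no" pair by hand for every excluded field gets tedious as the document grows. A minimal Python sketch that generates the index definition above from two lists of field names might look like this. The helper name build_person_mapping and its parameters are hypothetical, and it assumes every excluded field is an integer, which holds for this example only.

```python
import json


def build_person_mapping(stored_fields, excluded_fields):
    """Build the body for PUT /test_index (hypothetical helper, ES 1.x syntax)."""
    properties = {}
    for name in stored_fields:
        # Indexed (searchable) and stored, so it can be returned via "fields".
        properties[name] = {"type": "string", "index": "analyzed", "store": "yes"}
    for name in excluded_fields:
        # Neither searchable nor retrievable. Assumption: all are integers.
        properties[name] = {"type": "integer", "index": "no", "store": "no"}
    return {
        "mappings": {
            "person": {
                "_all": {"enabled": False},
                "_source": {"enabled": False},
                "properties": properties,
            }
        }
    }


body = build_person_mapping(["name"], ["age", "salary"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of the PUT /test_index request.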
Then I can add a few documents using the bulk API:
POST /test_index/person/_bulk
{"index":{"_id":1}}
{"name":"stev","age":26,"salary":25000}
{"index":{"_id":2}}
{"name":"bob","age":30,"salary":28000}
{"index":{"_id":3}}
{"name":"joe","age":27,"salary":35000}
Since I disabled "_source", a simple query returns only the IDs:
POST /test_index/_search
...
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "test_index",
"_type": "person",
"_id": "1",
"_score": 1
},
{
"_index": "test_index",
"_type": "person",
"_id": "2",
"_score": 1
},
{
"_index": "test_index",
"_type": "person",
"_id": "3",
"_score": 1
}
]
}
}
But if I specify that I want the "name" field, I get it:
POST /test_index/_search
{
"fields": [
"name"
]
}
...
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "test_index",
"_type": "person",
"_id": "1",
"_score": 1,
"fields": {
"name": [
"stev"
]
}
},
{
"_index": "test_index",
"_type": "person",
"_id": "2",
"_score": 1,
"fields": {
"name": [
"bob"
]
}
},
{
"_index": "test_index",
"_type": "person",
"_id": "3",
"_score": 1,
"fields": {
"name": [
"joe"
]
}
}
]
}
}
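Note that each stored field comes back under "fields" as an array, not as a scalar. As a rough illustration of reading such a response on the client side, here is a small Python sketch; the function name stored_values is made up for this example, and the response dict is a trimmed copy of the output above.

```python
def stored_values(response, field):
    """Map each hit's _id to the first stored value of `field` (None if absent)."""
    result = {}
    for hit in response["hits"]["hits"]:
        # "fields" is missing entirely when nothing stored was requested or found.
        values = hit.get("fields", {}).get(field, [])
        result[hit["_id"]] = values[0] if values else None
    return result


# Trimmed copy of the search response shown above.
response = {
    "hits": {
        "total": 3,
        "hits": [
            {"_id": "1", "fields": {"name": ["stev"]}},
            {"_id": "2", "fields": {"name": ["bob"]}},
            {"_id": "3", "fields": {"name": ["joe"]}},
        ],
    }
}

names = stored_values(response, "name")
ages = stored_values(response, "age")
print(names)
print(ages)
```

A field that was never stored, such as "age" here, simply has no entry under "fields", so the helper reports None for it.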
You can prove that the other fields are not stored by running:
POST /test_index/_search
{
"fields": [
"name", "age", "salary"
]
}
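This returns the same result. I can also prove that the "age" field is not indexed by running this query, which would return a document if "age" had been indexed: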
POST /test_index/_search
{
"fields": [
"name", "age"
],
"query": {
"term": {
"age": {
"value": 27
}
}
}
}
...
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
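Here is the code I used to play around with this. I wanted to use the "_default_" mapping and/or field to handle this without having to specify the settings for every field. I was able to make it work in terms of not storing data, but every field was still indexed.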
http://sense.qbox.io/gist/d84967923d6c0757dba5f44240f47257ba2fbe50
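One avenue that might cover the "every field was still indexed" problem is a catch-all entry in "dynamic_templates", which is a different mechanism from the "_default_" mapping tried above. The Python sketch below only builds such a mapping body; it has not been run against the cluster used in this answer, so treat it as an assumption to verify. The helper name build_catch_all_mapping and the template name not_indexed_by_default are arbitrary.

```python
import json


def build_catch_all_mapping(stored_field):
    """Hypothetical helper: keep one field, disable all other fields by template."""
    return {
        "mappings": {
            "person": {
                "_all": {"enabled": False},
                "_source": {"enabled": False},
                "dynamic_templates": [
                    {
                        "not_indexed_by_default": {  # arbitrary template name
                            "match": "*",  # applies to any field not mapped explicitly
                            "mapping": {
                                "type": "{dynamic_type}",  # keep the detected type
                                "index": "no",
                                "store": "no",
                            },
                        }
                    }
                ],
                "properties": {
                    # Explicitly mapped, so the template does not apply to it.
                    stored_field: {"type": "string", "index": "analyzed", "store": "yes"}
                },
            }
        }
    }


body = build_catch_all_mapping("name")
print(json.dumps(body, indent=2))
```

If it behaves as intended, the term query on "age" shown earlier should still return zero hits without "age" or "salary" being listed in the mapping.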