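I'm trying to build an application on top of an Elasticsearch index. I have several "internal" fields that can contain binary data (mostly PDFs), and I'm looking for the best way to define the pipeline and the mapping, given the following facts:
All fields and content can be provided in multiple languages (French and English) and across multiple fields.
I have to be able to query content for a given language and/or a given field.
This is how I have defined my mapping so far: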
{
  "WfNewsEvent": {
    "properties": {
      "title": {
        "type": "object",
        "properties": {
          "en": {
            "type": "string"
          },
          "fr": {
            "type": "string",
            "analyzer": "french",
            "search_analyzer": "french_search"
          }
        }
      },
      ...
      "extfile": {
        "type": "object",
        "properties": {
          "title": {
            "type": "object",
            "properties": {
              "en": {
                "type": "string"
              },
              "fr": {
                "type": "string",
                "analyzer": "french",
                "search_analyzer": "french_search"
              }
            }
          },
          "description": {
            "type": "object",
            "properties": {
              "en": {
                "type": "string"
              },
              "fr": {
                "type": "string",
                "analyzer": "french",
                "search_analyzer": "french_search"
              }
            }
          },
          "data": {
            "type": "object",
            "properties": {
              "en": {
                "type": "attachment"
              },
              "fr": {
                "type": "attachment",
                "analyzer": "french",
                "search_analyzer": "french_search"
              }
            }
          }
        }
      },
      "gallery": {
        "type": "object",
        "properties": {
          "title": {
            "type": "object",
            "properties": {
              "en": {
                "type": "string"
              },
              "fr": {
                "type": "string",
                "analyzer": "french",
                "search_analyzer": "french_search"
              }
            }
          },
          "description": {
            "type": "object",
            "properties": {
              "en": {
                "type": "string"
              },
              "fr": {
                "type": "string",
                "analyzer": "french",
                "search_analyzer": "french_search"
              }
            }
          },
          "data": {
            "type": "object",
            "properties": {
              "en": {
                "type": "attachment"
              },
              "fr": {
                "type": "attachment",
                "analyzer": "french",
                "search_analyzer": "french_search"
              }
            }
          }
        }
      }
    }
  }
}
And then my "attachment" pipeline definition:
{
  "description": "Extract attachment information",
  "processors": [
    {
      "attachment": {
        "field": "extfile.data.en",
        "ignore_missing": true
      }
    },
    {
      "attachment": {
        "field": "extfile.data.fr",
        "ignore_missing": true
      }
    },
    {
      "attachment": {
        "field": "gallery.data.en",
        "ignore_missing": true
      }
    },
    {
      "attachment": {
        "field": "gallery.data.fr",
        "ignore_missing": true
      }
    }
  ]
}
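To make the setup easier to reproduce, here is a minimal Python sketch of how a pipeline body with one attachment processor per field and language could be generated, and how a binary payload could be prepared for it. The helper names `build_pipeline` and `encode_attachment` are hypothetical, and the `target_field` values are my own assumption: without a `target_field`, every attachment processor writes its output to the same default `attachment` field, so I give each field/language pair its own target here. The source field itself has to contain base64-encoded data.

```python
import base64
import json

LANGS = ("en", "fr")
FIELDS = ("extfile", "gallery")


def build_pipeline(fields=FIELDS, langs=LANGS):
    """Build an ingest pipeline body with one attachment processor per field/language."""
    processors = []
    for field in fields:
        for lang in langs:
            processors.append({
                "attachment": {
                    # Source field holding the base64-encoded binary
                    "field": f"{field}.data.{lang}",
                    # Assumption: a separate target per field/language so the
                    # extracted results do not overwrite each other
                    "target_field": f"{field}.attachment.{lang}",
                    "ignore_missing": True,
                }
            })
    return {
        "description": "Extract attachment information",
        "processors": processors,
    }


def encode_attachment(raw: bytes) -> str:
    """Return the base64 text form that the attachment processor reads."""
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # Pipeline body that could be sent to the ingest pipeline API
    print(json.dumps(build_pipeline(), indent=2))

    # Hypothetical document with only the English attachment present
    doc = {"extfile": {"data": {"en": encode_attachment(b"%PDF-1.4 sample")}}}
    print(json.dumps(doc))
```

The printed JSON is only the request body; sending it to the cluster and indexing the document through the pipeline is left out on purpose.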
In practice, when I try to index a document, ES throws an exception saying that "data" is not an integer. So any help would be welcome!
Best regards, Henri