Defining an Elasticsearch mapping for inner fields with ingest attachment

Time: 2017-05-24 09:45:16

Tags: elasticsearch mapping attachment

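I'm trying to build an application on top of an Elasticsearch index. I have several "inner" fields that can contain binary data (mostly PDFs), and I'm looking for the best way to define the pipeline and the mapping, given the following constraints:

  • All fields and contents can be provided in several languages (French and English) and across several fields

  • I must be able to query the content for a given language and/or a given field.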
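
To make the second requirement concrete, here is a rough sketch of the kind of search body this should support: a search limited to French content, spread over several fields. The field paths are the ones from the mapping below; the search text is only a placeholder, and which fields to include is an assumption.

```json
{
  "query": {
    "multi_match": {
      "query": "rapport annuel",
      "fields": [
        "title.fr",
        "extfile.title.fr",
        "extfile.description.fr"
      ]
    }
  }
}
```

Restricting the search to a single field and language would simply mean listing one path, for example only `extfile.title.fr`.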

This is how I have defined the mapping so far:

{
    "WfNewsEvent": {
        "properties": {
            "title": {
                "type": "object",
                "properties": {
                    "en": {
                        "type": "string"
                    },
                    "fr": {
                        "type": "string",
                        "analyzer": "french",
                        "search_analyzer": "french_search"
                    }
                }
            },
            ...
            "extfile": {
                "type": "object",
                "properties": {
                    "title": {
                        "type": "object",
                        "properties": {
                            "en": {
                                "type": "string"
                            },
                            "fr": {
                                "type": "string",
                                "analyzer": "french",
                                "search_analyzer": "french_search"
                            }
                        }
                    },
                    "description": {
                        "type": "object",
                        "properties": {
                            "en": {
                                "type": "string"
                            },
                            "fr": {
                                "type": "string",
                                "analyzer": "french",
                                "search_analyzer": "french_search"
                            }
                        }
                    },
                    "data": {
                        "type": "object",
                        "properties": {
                            "en": {
                                "type": "attachment"
                            },
                            "fr": {
                                "type": "attachment",
                                "analyzer": "french",
                                "search_analyzer": "french_search"
                            }
                        }
                    }
                }
            },
            "gallery": {
                "type": "object",
                "properties": {
                    "title": {
                        "type": "object",
                        "properties": {
                            "en": {
                                "type": "string"
                            },
                            "fr": {
                                "type": "string",
                                "analyzer": "french",
                                "search_analyzer": "french_search"
                            }
                        }
                    },
                    "description": {
                        "type": "object",
                        "properties": {
                            "en": {
                                "type": "string"
                            },
                            "fr": {
                                "type": "string",
                                "analyzer": "french",
                                "search_analyzer": "french_search"
                            }
                        }
                    },
                    "data": {
                        "type": "object",
                        "properties": {
                            "en": {
                                "type": "attachment"
                            },
                            "fr": {
                                "type": "attachment",
                                "analyzer": "french",
                                "search_analyzer": "french_search"
                            }
                        }
                    }
                }
            }
        }
    }
}

And here is my "attachment" pipeline definition:

{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "extfile.data.en",
        "ignore_missing": true
      }
    },
    {
      "attachment" : {
        "field" : "extfile.data.fr",
        "ignore_missing": true
      }
    },
    {
      "attachment" : {
        "field" : "gallery.data.en",
        "ignore_missing": true
      }
    },
    {
      "attachment" : {
        "field" : "gallery.data.fr",
        "ignore_missing": true
      }
    }
  ]
}
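
One point that may matter here: as far as I understand, the ingest attachment processor writes its extracted fields to a field named `attachment` unless `target_field` is set, so several processors in the same pipeline would all write to the same place. Below is a minimal sketch of a variant that gives each source field its own target. The `target_field` names (`extfile.content.en`, `extfile.content.fr`) are hypothetical placeholders, not something taken from a working setup, and the two `gallery` processors would follow the same pattern.

```json
{
  "description" : "Extract attachment information (sketch with one target per source field)",
  "processors" : [
    {
      "attachment" : {
        "field" : "extfile.data.en",
        "target_field" : "extfile.content.en",
        "ignore_missing": true
      }
    },
    {
      "attachment" : {
        "field" : "extfile.data.fr",
        "target_field" : "extfile.content.fr",
        "ignore_missing": true
      }
    }
  ]
}
```

This is untested against the mapping above and is not claimed to resolve the error described below; it only illustrates keeping the extracted text of each language apart.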

In fact, when I try to index a document, ES throws an exception saying that "data" is not an integer. So any help would be welcome!

Best regards, Henri

0 answers:

No answers yet