Elasticsearch Custom Analyzer

Time: 2016-09-15 20:21:49

Tags: json curl elasticsearch analysis

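I am currently building a search engine with Scrapy as the crawler and Elasticsearch as the server. Scrapy and Elasticsearch both work fine, but what I am struggling with is case-insensitive search using a German analyzer. My documents have the following general structure (result of a match_all query):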

"hits": {
    "total": 14,
    "max_score": 0.40951526,
    "hits": [
        {
            "_index": "uni",
            "_type": "items",
            "_id": "AVcuHuT6qni1Wq78foIA",
            "_score": 0.40951526,
            "_source": {
                "description": "...",
                "tags": [
                    "...",
                    "..."
                ],
                "url":"...",
                "author": "...",
                "content": "...",
                "date": "18.09.2015",
                "title": "..."
            },
            "highlight": {
                "content": [
                    "...",
                    "...",
                    "..."
                ]
            }
        }
    ]
}

I then tried to add these settings with "curl -XPUT localhost:9200/uni {...}":

{
    "mappings":{
        "_source":{
            "type":"object",
            "properties":{
                "title":{
                    "type":"string",
                    "analyzer":"german_lowercase"
                },
                "content":{
                    "type":"string",
                    "analyzer":"german_lowercase"
                },
                "description":{
                    "type":"string",
                    "analyzer":"german_lowercase"
                },
                "tags":{
                    "type":"array",
                    "analyzer":"german_lowercase"
                }
            }
        }
    },
    "settings":{
        "uni":{
            "analysis":{
                "analyzer":{
                    "german_lowercase":{
                        "type":"custom",
                        "tokenizer":"keyword",
                        "filter":[
                            "lowercase",
                            "german_stop",
                            "german_keywords",
                            "german_normalization",
                            "german_stemmer"
                        ]
                    }
                },
                "filter":{
                    "german_stop": {
                        "type": "stop",
                        "stopwords": "_german_"
                    },
                    "german_keywords": {
                        "type": "keyword_marker",
                        "keywords": []
                    },
                    "german_stemmer": {
                        "type": "stemmer",
                        "language": "light_german"
                    }
                }
            }
        }
    }
}

I am not sure what I am doing wrong. Can anyone help?
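For comparison, here is a minimal sketch of how the body above might be restructured. It is not a confirmed fix. It assumes a 2016-era Elasticsearch (2.x), where text fields use "type": "string" (later versions use "text"), and it assumes the mapping type is "items", taken from the "_type" in the search result. The function name build_index_body and the TEXT_FIELDS tuple are hypothetical. The sketch makes four changes: the mapping is keyed by the type name instead of "_source", "tags" is mapped as a string because arrays have no dedicated type, "analysis" sits directly under "settings" rather than under the index name, and the tokenizer is "standard" instead of "keyword". The "german_keywords" filter is left out here because its keyword list was empty.

```python
import json

# Fields from the question that should use the custom analyzer.
TEXT_FIELDS = ("title", "content", "description", "tags")


def build_index_body(doc_type="items", analyzer="german_lowercase"):
    """Return a create-index body with a lowercasing German analyzer."""
    analysis = {
        "filter": {
            "german_stop": {"type": "stop", "stopwords": "_german_"},
            "german_stemmer": {"type": "stemmer", "language": "light_german"},
        },
        "analyzer": {
            analyzer: {
                "type": "custom",
                # "standard" splits text into words; "keyword" would keep
                # the whole field value as one single token.
                "tokenizer": "standard",
                "filter": [
                    "lowercase",
                    "german_stop",
                    "german_normalization",  # built-in filter, no definition needed
                    "german_stemmer",
                ],
            }
        },
    }
    # A field holding a list of strings is mapped like a single string field.
    properties = {
        name: {"type": "string", "analyzer": analyzer} for name in TEXT_FIELDS
    }
    return {
        "settings": {"analysis": analysis},
        "mappings": {doc_type: {"properties": properties}},
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the body of the create-index request.
    print(json.dumps(build_index_body(), indent=4))
```

Note that the built-in "german" analyzer already includes a lowercase filter, so a custom analyzer may only be needed if the filter chain has to differ from the default one.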

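Edit: Elasticsearch does not let me put these settings on the index (which already exists), and if I try to put the mapping on its own I get a "mapping type is missing" exception. Updating the settings fails with an error about updating non-dynamic settings. So I am asking for more general information: how should I update these settings/mappings so that case-insensitive search is enabled? (Other posts describe the same problem.)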
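Some background on the two errors: analysis settings are static, so they cannot be changed on an open index, and in Elasticsearch 2.x a put-mapping request needs the type in the URL (for example /uni/_mapping/items). The analyzer of an existing field cannot be changed in place either, and documents that are already indexed keep their old tokens. One common approach is therefore to create a new index with the desired settings and mappings, then copy the documents over. The sketch below only plans those requests and renders them as curl commands; it sends nothing. The new index name "uni_v2" and the helpers plan_reindex and to_curl are hypothetical, and the _reindex endpoint is assumed to be available (it was added in Elasticsearch 2.3).

```python
import json


def plan_reindex(old_index, new_index, index_body):
    """Return (method, path, body) steps that move data to a new index."""
    return [
        # 1. Create the new index with settings and mappings in one call.
        ("PUT", "/" + new_index, index_body),
        # 2. Copy all documents so they are analyzed with the new analyzer.
        (
            "POST",
            "/_reindex",
            {"source": {"index": old_index}, "dest": {"index": new_index}},
        ),
    ]


def to_curl(method, path, body, host="localhost:9200"):
    """Render one step as a curl command line (for inspection only).

    Assumes the JSON body contains no single quotes.
    """
    return "curl -X{} -H 'Content-Type: application/json' '{}{}' -d '{}'".format(
        method, host, path, json.dumps(body)
    )


if __name__ == "__main__":
    # Placeholder body; in practice this would hold the analysis settings
    # and the mappings for the new index.
    placeholder_body = {"settings": {}, "mappings": {}}
    for step in plan_reindex("uni", "uni_v2", placeholder_body):
        print(to_curl(*step))
```

After the copy, the crawler and the search requests would need to point at the new index name, or at an alias that refers to it.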

0 answers:

No answers yet.