Question

Problem description

I want to run a query string like this one, for example:

{"query": {
    "query_string" : {
        "fields" : ["description"],
        "query" : "illegal~"
        }
     }
 }

I have a separate synonyms.txt file that contains the synonyms:

illegal, banned, criminal, illegitimate, illicit, irregular, outlawed, prohibited
otherWord, synonym1, synonym2...

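I want to find all elements that contain any one of these synonyms.

What I tried

First, I wanted to index these synonyms in my ES database.

I tried running this query with curl: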

curl -X PUT "https://instanceAdress.europe-west1.gcp.cloud.es.io:9243/app/kibana#/dev_tools/console/sources" -H 'Content-Type: application/json' -d' {
"settings": {
    "index" : {
        "analysis" : {
            "analyzer" : {
                "synonym" : {
                    "tokenizer" : "whitespace",
                    "filter" : ["synonym"]
                }
            },
            "filter" : {
                "synonym" : {
                    "type" : "synonym",
                    "synonyms_path" : "synonyms.txt"
                }
            }
        }
    }
}
}
'

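But it does not work: {"statusCode":404,"error":"Not Found"}

I then need to change my query so that it takes the synonyms into account, but I do not know how.

So my questions are:

How do I index my synonyms?
How do I change the query so that it searches all the synonyms?
Is there a way to index them in Python?

Using Python Elasticsearch

Example of a get query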

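from elasticsearch import Elasticsearch

# Connect to the hosted cluster over HTTPS with basic authentication
es = Elasticsearch(
    ['fullAdress.europe-west1.gcp.cloud.es.io'],
    http_auth=('login', 'password'),
    scheme="https",
    port=9243,
)
# Fetch one document by id from the "sources" index
es.get(index="sources", doc_type='rcp', id="301495")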

Answer 1

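You can index with synonyms from Python as follows. First, create the token filter:

# token_filter is provided by the elasticsearch_dsl package
from elasticsearch_dsl import token_filter

# your_synonyms: a list of rule strings, e.g. 'illegal, banned, criminal'
synonyms_token_filter = token_filter(
  'synonyms_token_filter',     # Any name for the filter
  'synonym',                   # Synonym filter type
  synonyms=your_synonyms       # Synonyms mapping will be inlined
)

Then create an analyzer that uses this filter.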
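
A minimal sketch of what such an analyzer could amount to, written as the plain analysis settings that Elasticsearch receives rather than with a helper library. The names parse_synonym_rules, build_analysis_settings and custom_analyzer, the standard tokenizer, and the lowercase filter are assumptions for illustration and are not taken from the answer.

```python
def parse_synonym_rules(text):
    """Return one rule string per non-empty, non-comment line of a synonyms file."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            rules.append(line)
    return rules


def build_analysis_settings(rules):
    """Build index settings with an inline synonym filter and an analyzer that uses it."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Same filter name and type as in the answer's token filter
                    "synonyms_token_filter": {
                        "type": "synonym",
                        "synonyms": rules,
                    }
                },
                "analyzer": {
                    # Hypothetical analyzer name; lowercase runs before the synonym filter
                    "custom_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "synonyms_token_filter"],
                    }
                },
            }
        }
    }


# In practice the text would be the contents of synonyms.txt
your_synonyms = parse_synonym_rules(
    "illegal, banned, criminal, illegitimate, illicit, irregular, outlawed, prohibited\n"
)
settings = build_analysis_settings(your_synonyms)
```

The resulting dictionary has the same shape as the settings in the question's curl body, with the rules inlined through synonyms instead of referenced through synonyms_path.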

There is also a package for this: https://github.com/agora-team/elasticsearch-synonyms
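
For the other two questions, the sketch below shows one possible way to send the question's analysis settings and a synonym-aware query from Python. It assumes a client object whose indices.create and search methods accept index and body keyword arguments, as the official elasticsearch package did in the versions contemporary with the question; newer client versions favour separate keyword arguments over body. All helper names are hypothetical. The rules are inlined through synonyms rather than synonyms_path, since a path refers to a file on the Elasticsearch nodes themselves. Note also that the curl command in the question targets a Kibana Dev Tools URL, whereas index settings have to be sent to the Elasticsearch endpoint with the index name as the path.

```python
SYNONYM_RULES = [
    "illegal, banned, criminal, illegitimate, illicit, irregular, outlawed, prohibited",
]


def build_index_body(rules):
    """Index settings from the question, with the rules inlined instead of read from a path."""
    return {
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        "synonym": {"tokenizer": "whitespace", "filter": ["synonym"]}
                    },
                    "filter": {"synonym": {"type": "synonym", "synonyms": rules}},
                }
            }
        }
    }


def build_query_body(text, field="description", analyzer="synonym"):
    """query_string query whose text is analyzed with the synonym analyzer at search time."""
    return {
        "query": {
            "query_string": {"fields": [field], "query": text, "analyzer": analyzer}
        }
    }


def create_index(es, index_name, rules):
    """Create a new index that defines the synonym analyzer (es is an assumed client object)."""
    return es.indices.create(index=index_name, body=build_index_body(rules))


def search_with_synonyms(es, index_name, text):
    """Search the description field, expanding the text through the synonym analyzer."""
    return es.search(index=index_name, body=build_query_body(text))
```

With es being the client from the question, create_index(es, "sources_with_synonyms", SYNONYM_RULES) followed by search_with_synonyms(es, "sources_with_synonyms", "illegal") would be the intended usage, where sources_with_synonyms is a made-up index name and the documents still have to be indexed into it. The query term is written without the ~ fuzzy operator, because fuzziness matches similar spellings rather than synonyms.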
