Adding a new document to Elasticsearch via elasticsearch.index: body structure and mappings

Time: 2019-10-11 00:29:58

Tags: elasticsearch elasticsearch-py

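I'm building a blog-like application with Flask (based on Miguel Grinberg's Mega-Tutorial) and I'm trying to set up an ES index that supports autocomplete. I'm struggling to set the index up correctly.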

I started from a simple indexing mechanism (which works):

from flask import current_app

def add_to_index(index, model):
    if not current_app.elasticsearch:
        return
    payload = {}
    for field in model.__searchable__:
        payload[field] = getattr(model, field)
    current_app.elasticsearch.index(index=index, id=model.id, body=payload)

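After some time searching on Google, I found that my body could look something like this (probably with fewer analyzers, but I copied it exactly as I found it somewhere, and the author claims it works):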

{
  "settings": {
    "index": {
      "analysis": {
        "filter": {},
        "analyzer": {
          "keyword_analyzer": {
            "filter": [
              "lowercase",
              "asciifolding",
              "trim"
            ],
            "char_filter": [],
            "type": "custom",
            "tokenizer": "keyword"
          },
          "edge_ngram_analyzer": {
            "filter": [
              "lowercase"
            ],
            "tokenizer": "edge_ngram_tokenizer"
          },
          "edge_ngram_search_analyzer": {
            "tokenizer": "lowercase"
          }
        },
        "tokenizer": {
          "edge_ngram_tokenizer": {
            "type": "edge_ngram",
            "min_gram": 2,
            "max_gram": 5,
            "token_chars": [
              "letter"
            ]
          }
        }
      }
    }
  },
  "mappings": {
    field: {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "keywordstring": {
              "type": "text",
              "analyzer": "keyword_analyzer"
            },
            "edgengram": {
              "type": "text",
              "analyzer": "edge_ngram_analyzer",
              "search_analyzer": "edge_ngram_search_analyzer"
            },
            "completion": {
              "type": "completion"
            }
          },
          "analyzer": "standard"
        }
      }
    }
  }
}

I figured out that I could modify the original mechanism to:

for field in model.__searchable__:
    temp = getattr(model, field)
    fields[field] = {"properties": {
      "type": "text",
      "fields": {
        "keywordstring": {
          "type": "text",
          "analyzer": "keyword_analyzer"
        },
        "edgengram": {
          "type": "text",
          "analyzer": "edge_ngram_analyzer",
          "search_analyzer": "edge_ngram_search_analyzer"
        },
        "completion": {
          "type": "completion"
        }
      },
      "analyzer": "standard"
    }}
payload = {
    "settings": {
        "index": {
          "analysis": {
            "filter": {},
            "analyzer": {
              "keyword_analyzer": {
                "filter": [
                  "lowercase",
                  "asciifolding",
                  "trim"
                ],
                "char_filter": [],
                "type": "custom",
                "tokenizer": "keyword"
              },
              "edge_ngram_analyzer": {
                "filter": [
                  "lowercase"
                ],
                "tokenizer": "edge_ngram_tokenizer"
              },
              "edge_ngram_search_analyzer": {
                "tokenizer": "lowercase"
              }
            },
            "tokenizer": {
              "edge_ngram_tokenizer": {
                "type": "edge_ngram",
                "min_gram": 2,
                "max_gram": 5,
                "token_chars": [
                  "letter"
                ]
              }
            }
          }
        }
    },
    "mappings": fields
}

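But this is where I get lost. Where in this document should I put the actual content (temp = getattr(model, field)) so that the whole thing works? I couldn't find any example, or any relevant part of the documentation, that covers updating an index with a slightly more complex mapping. Is this approach even correct or feasible? Every guide I have seen covers only bulk indexing, or troubleshooting a failed connection.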

1 answer:

Answer 0 (score: 1)

I think you are mixing up two things; let me try to explain. What you want is to add a document to Elasticsearch:

  

current_app.elasticsearch.index(index=index, id=model.id, body=payload)

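This uses the index() method defined in the elasticsearch-py library. See the example here: https://elasticsearch-py.readthedocs.io/en/master/index.html#example-usage The body must be a plain dictionary containing your document's fields, as in the example in the docs.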
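To illustrate what "a plain dictionary" means here, the following is a minimal sketch of the asker's original helper with the document body built separately. `build_document` is a hypothetical helper name, and the client is passed in as `es` (an assumption made so the snippet is self-contained) instead of being read from `current_app.elasticsearch`.

```python
def build_document(model):
    """Collect the searchable attributes of a model into a plain dict.

    The result holds only field values; it contains no "settings" or
    "mappings" keys.
    """
    return {field: getattr(model, field) for field in model.__searchable__}


def add_to_index(es, index, model):
    """Index one model instance as a document.

    `es` is assumed to be an elasticsearch-py client (or None when
    search is not configured).
    """
    if es is None:
        return
    es.index(index=index, id=model.id, body=build_document(model))
```

For a model with `__searchable__ = ["name"]`, the body sent to Elasticsearch would simply be `{"name": <value of model.name>}`.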

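What you put in the body are index settings, which is something different. To use a database analogy: you are trying to define the table's schema inside a row (the document).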

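If you want to apply the settings you showed, you need to use put_settings, as defined here: https://elasticsearch-py.readthedocs.io/en/master/api.html?highlight=settings#elasticsearch.client.ClusterClient.put_settings (note that this link points at the cluster-level method; for a single index the counterpart is IndicesClient.put_settings, reached as es.indices.put_settings, and the field mappings are set separately with es.indices.put_mapping or when the index is created).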
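As a rough sketch of that separation: the analysis settings and the field mappings can be sent once, when the index is created, and documents are then indexed with plain bodies. This assumes Elasticsearch 7 or later (mappings without a type name) and the elasticsearch-py 7.x call style with `body=`, as in the question. `build_index_body` and `create_index` are hypothetical helper names, and `es` is assumed to be an elasticsearch-py client passed in by the caller. It uses `indices.create` instead of `put_settings`, which avoids having to change analysis settings on an index that already exists.

```python
import copy

# Per-field mapping taken from the question: the text field plus three
# sub-fields (keywordstring, edgengram, completion).
FIELD_MAPPING = {
    "type": "text",
    "analyzer": "standard",
    "fields": {
        "keywordstring": {"type": "text", "analyzer": "keyword_analyzer"},
        "edgengram": {
            "type": "text",
            "analyzer": "edge_ngram_analyzer",
            "search_analyzer": "edge_ngram_search_analyzer",
        },
        "completion": {"type": "completion"},
    },
}

# Analysis section copied from the question.
ANALYSIS = {
    "filter": {},
    "analyzer": {
        "keyword_analyzer": {
            "type": "custom",
            "tokenizer": "keyword",
            "char_filter": [],
            "filter": ["lowercase", "asciifolding", "trim"],
        },
        "edge_ngram_analyzer": {
            "tokenizer": "edge_ngram_tokenizer",
            "filter": ["lowercase"],
        },
        "edge_ngram_search_analyzer": {"tokenizer": "lowercase"},
    },
    "tokenizer": {
        "edge_ngram_tokenizer": {
            "type": "edge_ngram",
            "min_gram": 2,
            "max_gram": 5,
            "token_chars": ["letter"],
        }
    },
}


def build_index_body(searchable_fields):
    """Return the index-creation body: settings plus one mapping per field."""
    properties = {
        name: copy.deepcopy(FIELD_MAPPING) for name in searchable_fields
    }
    return {
        "settings": {"index": {"analysis": copy.deepcopy(ANALYSIS)}},
        # Typeless mappings: an assumption that holds for Elasticsearch 7+.
        "mappings": {"properties": properties},
    }


def create_index(es, index, searchable_fields):
    """Create the index with its schema once; do nothing if it exists."""
    if not es.indices.exists(index=index):
        es.indices.create(index=index, body=build_index_body(searchable_fields))
```

After `create_index(es, "post", ["name"])` has run once, every later `es.index(...)` call for that index only needs the document's own fields in its body.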

Hope this helps.