Question

My requirement is to store multiple markets for a product in Elasticsearch.

Index: ratingsindextest, type: productchild

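Each product will have multiple markets, each market will have multiple regions, and each region will have multiple countries, which I have represented as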

{
  "ratingsindextest": {
    "mappings": {
      "productchild": {
        "_routing": {
          "required": true
        },
        "properties": {
          "productId": {
            "type": "string",
            "index": "not_analyzed"
          },
          "productMarkets": {
            "type": "nested",
            "properties": {
              "lobID": {
                "type": "string",
                "index": "not_analyzed"
              },
              "lobValue": {
                "type": "string",
                "index": "not_analyzed"
              },
              "regions": {
                "type": "nested",
                "properties": {
                  "countries": {
                    "type": "nested",
                    "properties": {
                      "country": {
                        "type": "string",
                        "index": "not_analyzed"
                      },
                      "status": {
                        "type": "string",
                        "index": "not_analyzed"
                      }
                    }
                  },
                  "regionId": {
                    "type": "string",
                    "index": "not_analyzed"
                  },
                  "regionName": {
                    "type": "string",
                    "index": "not_analyzed"
                  }
                }
              }
            }
          },
          "productName": {
            "type": "string",
            "analyzer": "english"
          },
          "shortDescription": {
            "type": "string",
            "analyzer": "english"
          },
          "supplierId": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}

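My question is: will creating multi-level nested fields as above incur any performance overhead? Or would it be advisable to change the mapping to hold a flat list, like the following?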

{
  "ratingsindextest": {
    "mappings": {
      "productchild": {
        "_routing": {
          "required": true
        },
        "properties": {
          "productId": {
            "type": "string",
            "index": "not_analyzed"
          },
          "productMarkets": {
            "type": "nested",
            "properties": {
              "lobID": {
                "type": "string",
                "index": "not_analyzed"
              },
              "lobValue": {
                "type": "string",
                "index": "not_analyzed"
              },
              "region": {
                "type": "string",
                "index": "not_analyzed"
              },
              "countryId": {
                "type": "string",
                "index": "not_analyzed"
              },
              "countryName": {
                "type": "string",
                "index": "not_analyzed"
              }
            }
          },
          "productName": {
            "type": "string",
            "analyzer": "english"
          }
        }
      }
    }
  }
}

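With this second approach there will be multiple nested documents that share the same LOB but differ in region, with each region covering multiple countries.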
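To make the flat layout concrete, here is a rough Python sketch of how the hierarchical market data could be turned into the flat rows of the second mapping. The helper name `flatten_markets` and the sample values are hypothetical. It assumes `regionName` maps to `region` and `country` maps to `countryName`; `countryId` is left out because the hierarchical mapping has no such field, and `status` is dropped because the flat mapping does not define it.

```python
def flatten_markets(markets):
    """Turn the market -> regions -> countries hierarchy into flat rows,
    one row per (market, region, country) combination."""
    rows = []
    for market in markets:
        for region in market.get("regions", []):
            for country in region.get("countries", []):
                rows.append({
                    "lobID": market["lobID"],
                    "lobValue": market["lobValue"],
                    "region": region["regionName"],         # assumed field mapping
                    "countryName": country["country"],      # assumed field mapping
                })
    return rows


# Hypothetical sample data: one LOB, two regions, three countries in total.
sample_markets = [
    {
        "lobID": "L1",
        "lobValue": "Retail",
        "regions": [
            {"regionId": "R1", "regionName": "EMEA",
             "countries": [{"country": "DE", "status": "active"},
                           {"country": "FR", "status": "active"}]},
            {"regionId": "R2", "regionName": "APAC",
             "countries": [{"country": "JP", "status": "inactive"}]},
        ],
    }
]

rows = flatten_markets(sample_markets)
```

Each row repeats the LOB fields, which is the duplication described above: the same LOB appears once per region/country pair.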

Any suggestions on which approach to take are appreciated.

Answer 1

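If you can denormalize your data, that is always the best option, but of course it is not always possible. Also, the less nesting the better: fewer nested levels make nested searches less expensive.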
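As an illustration of what "less nesting" means at query time, the sketch below builds the request body for a country lookup against each of the two mappings in the question. It only constructs plain dictionaries in the shape of the Elasticsearch `nested` query (`path` plus `query`); the helper names and the sample value "US" are hypothetical, and nothing is sent to a cluster.

```python
import json


def nested(path, inner):
    """Wrap `inner` in a `nested` query clause for the given path."""
    return {"nested": {"path": path, "query": inner}}


def multi_level_query(country):
    """Three-level mapping: one `nested` clause per nesting level."""
    leaf = {"term": {"productMarkets.regions.countries.country": country}}
    return {
        "query": nested(
            "productMarkets",
            nested(
                "productMarkets.regions",
                nested("productMarkets.regions.countries", leaf),
            ),
        )
    }


def flat_query(country_name):
    """Flat mapping: a single `nested` clause is enough."""
    leaf = {"term": {"productMarkets.countryName": country_name}}
    return {"query": nested("productMarkets", leaf)}


def nested_depth(body):
    """Count how many `nested` clauses are stacked in a query body."""
    depth = 0
    node = body["query"]
    while "nested" in node:
        depth += 1
        node = node["nested"]["query"]
    return depth


if __name__ == "__main__":
    print(json.dumps(multi_level_query("US"), indent=2))
    print(json.dumps(flat_query("US"), indent=2))
```

The multi-level body stacks three `nested` clauses where the flat one needs a single clause, which is the extra work the advice above is pointing at.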

Nested:


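Nested docs are stored together in the same Lucene block, which helps read/query performance. Reading a nested doc is faster than the equivalent parent/child.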

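Updating a single field in a nested document (parent or nested children) forces ES to reindex the entire nested document. This can be very expensive for large nested docs.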

"Cross-referencing" nested documents is impossible.

Best suited for data that does not change frequently.
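To get a feel for the reindexing cost, recall that each nested object is indexed as its own hidden Lucene document alongside the root document. The sketch below estimates the size of that block for the two layouts in the question. The function names and the sample product are hypothetical, and the counts are a back-of-the-envelope estimate, not a measurement.

```python
def lucene_docs_multi_level(product):
    """Root doc plus one hidden doc per market, per region and per country."""
    count = 1  # the root product document
    for market in product.get("productMarkets", []):
        count += 1  # the market itself
        for region in market.get("regions", []):
            count += 1 + len(region.get("countries", []))
    return count


def lucene_docs_flat(product):
    """Root doc plus one hidden doc per flat productMarkets row."""
    return 1 + len(product.get("productMarkets", []))


# Hypothetical product: 1 market, 2 regions, 3 countries in total.
multi_level_product = {
    "productMarkets": [
        {"lobID": "L1", "regions": [
            {"regionId": "R1", "countries": [{"country": "DE"}, {"country": "FR"}]},
            {"regionId": "R2", "countries": [{"country": "JP"}]},
        ]}
    ]
}

# The same data in the flat layout: one row per (market, region, country).
flat_product = {
    "productMarkets": [
        {"lobID": "L1", "region": "R1", "countryName": "DE"},
        {"lobID": "L1", "region": "R1", "countryName": "FR"},
        {"lobID": "L1", "region": "R2", "countryName": "JP"},
    ]
}

multi_count = lucene_docs_multi_level(multi_level_product)
flat_count = lucene_docs_flat(flat_product)
```

For this sample the multi-level layout comes to 7 Lucene documents and the flat layout to 4. Changing any single field means the whole block is written again in either case.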


Parent/Child:


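Children are stored separately from the parent, but are routed to the same shard. So parent/child is slightly slower on read/query than nested.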

Parent/child mappings add some memory overhead, since ES maintains a "join" list in memory.

Updating a child doc does not affect the parent or any other children, which can save a lot of indexing on large docs.

Sorting/scoring can be difficult with parent/child, since the has_child/has_parent operations can be opaque at times.


Denormalization:


You get to manage all the relations yourself!

Most flexible, but also the most administrative overhead.

May be more or less performant depending on your setup.

Quoted from: Managing Relations Inside Elasticsearch
