Solution 1:

We can meet the above requirement by raising the field limit in the settings of the company index, as shown below:

PUT company/_settings
{
  "index.mapping.total_fields.limit": 2000
}

Then set the mapping for the customer type to:

PUT /company/_mapping/customer
{
   "customer": {
      "properties": {
         "property1": {
            "type": "text"
         },
         "property2": {
            "type": "text"
         },
         .
         .
         .
         "property2000": {
            "type": "text"
         }
      }
   }
}
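
As a rough sketch, a mapping body like the one above could be generated instead of written by hand. The helper name build_customer_mapping is hypothetical, the property names simply follow the property1 ... property2000 pattern shown above, and the snippet only builds the JSON body; it does not talk to Elasticsearch.

```python
import json


def build_customer_mapping(field_count, type_name="customer"):
    """Build a mapping body with property1..property<field_count> as text fields."""
    properties = {
        f"property{i}": {"type": "text"} for i in range(1, field_count + 1)
    }
    return {type_name: {"properties": properties}}


# Full-size body matching the raised field limit (assumption: 2000 fields).
mapping = build_customer_mapping(2000)
print(len(mapping["customer"]["properties"]))

# Small body, printed so the generated structure is easy to inspect.
print(json.dumps(build_customer_mapping(2), indent=2))
```

The printed JSON can be used as the body of the PUT /company/_mapping/customer request.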

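Problem:

The above solution leads to a data sparsity problem, because not every customer has all of the properties.

Solution 2:

We can create a separate _type for the customer properties (say, custom_props) and use the following mapping: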

PUT /company/_mapping/custom_props
{
   "custom_props": {
      "_parent": {
         "type": "customer"
      },
      "_routing": {
         "required": true
      },
      "properties": {
         "property_name": {
            "type": "text"
         },
         "property_value": {
            "type": "text"
         }
      }
   }
}

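Now every property of a customer is stored as a separate document in custom_props.

Problem:

When searching for specific customers that have certain properties, we need has_child queries in some use cases and has_child queries with inner_hits in others. According to the ES documentation, these queries are significantly slower than simple search queries.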
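
To make the query shape concrete, here is a minimal sketch of the kind of has_child request body meant here, built as a Python dict. The helper name has_child_query and the sample values are hypothetical; the field names come from the custom_props mapping above, and the snippet only builds the body without sending it.

```python
import json


def has_child_query(name, value, with_inner_hits=False):
    """Build a search body that finds customers having a matching custom_props child."""
    child_query = {
        "bool": {
            "must": [
                {"match": {"property_name": name}},
                {"match": {"property_value": value}},
            ]
        }
    }
    has_child = {"type": "custom_props", "query": child_query}
    if with_inner_hits:
        # An empty inner_hits object asks for the matching children with default options.
        has_child["inner_hits"] = {}
    return {"query": {"has_child": has_child}}


print(json.dumps(has_child_query("name", "John", with_inner_hits=True), indent=2))
```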

So I am looking for a solution to this problem: what is the best alternative when we have this many fields in our Elasticsearch _index?

Answer 1

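There is one way of handling relations in Elasticsearch that you have not considered: nested objects. They are similar to parent/child, but usually have better query performance.

With the nested data type, the mapping might look like this: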

PUT /company/
{
  "mappings": {
    "customer": {
      "properties": {
        "commonProperty": {
          "type": "text"
        },
        "customerSpecific": {
          "type": "nested",
          "properties": {
            "property_name": {
              "type": "keyword"
            },
            "property_value": {
              "type": "text"
            }
          }
        }
      }
    }
  }
}

Let's see what the documents look like:

POST /company/customer/1
{
  "commonProperty": "companyID1",
  "customerSpecific": [
    {
      "property_name": "name",
      "property_value": "John Doe"
    },
    {
      "property_name": "address",
      "property_value": "Simple Rd, 112"
    }
  ]
}

POST /company/customer/2
{
  "commonProperty": "companyID1",
  "customerSpecific": [
    {
      "property_name": "name",
      "property_value": "Jane Adams"
    },
    {
      "property_name": "address",
      "property_value": "42 St., 15"
    }
  ]
}
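
If the customer data currently lives in flat key/value form, converting it into this shape is mechanical. The following is only a sketch: the helper name to_nested_customer is hypothetical, and it assumes every value can be stored as a string in property_value.

```python
import json


def to_nested_customer(common_property, custom_props):
    """Turn a flat dict of customer-specific properties into the nested document shape."""
    return {
        "commonProperty": common_property,
        "customerSpecific": [
            {"property_name": name, "property_value": str(value)}
            for name, value in custom_props.items()
        ],
    }


doc = to_nested_customer(
    "companyID1",
    {"name": "John Doe", "address": "Simple Rd, 112"},
)
print(json.dumps(doc, indent=2))
```

The printed JSON has the same shape as the body indexed for customer 1 above.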

To be able to query such data, we have to use a nested query. For example, to find customers named "John", we might use a query like this:

POST /company/customer/_search
{
  "query": {
    "nested": {
      "path": "customerSpecific",
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "customerSpecific.property_name": "name"
              }
            },
            {
              "match": {
                "customerSpecific.property_value": "John"
              }
            }
          ]
        }
      }
    }
  }
}
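
One thing to keep in mind when filtering on several properties at once: each nested object is matched on its own, so conditions on two different property names need two separate nested clauses. Putting both inside a single nested bool would require one nested object to carry both names, which never matches. A minimal sketch of building such a request body follows; the helper names are hypothetical and nothing is sent to Elasticsearch.

```python
import json


def nested_property_clause(name, value, path="customerSpecific"):
    """One nested clause: a single nested object must have this name and match this value."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        {"term": {f"{path}.property_name": name}},
                        {"match": {f"{path}.property_value": value}},
                    ]
                }
            },
        }
    }


def customers_matching(conditions):
    """Combine one nested clause per property so that all conditions must hold."""
    clauses = [nested_property_clause(n, v) for n, v in conditions.items()]
    return {"query": {"bool": {"must": clauses}}}


body = customers_matching({"name": "John", "address": "Simple"})
print(json.dumps(body, indent=2))
```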

Hope that helps!
