Elasticsearch aggregations for faceted search, excluding some fields

Asked: 2017-07-07 10:40:17

Tags: elasticsearch filter aggregation faceted-search

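I have a shop with faceted search built on elasticsearch 2.4. At the moment the available filters (product attributes) are fetched from MySQL, and I want to use elasticsearch aggregations for this instead. The problem is that I do not need to aggregate all attributes.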

What I have:

Part of the mapping:

...
'is_active' => [
    'type' => 'long',
    'index' => 'not_analyzed',
],
'category_id' => [
    'type' => 'long',
    'index' => 'not_analyzed',
],
'attrs' => [
    'properties' => [
        'attr_name' => ['type' => 'string', 'index' => 'not_analyzed'],
        'value' => [
            'type' => 'string',
            'index' => 'analyzed',
            'analyzer' => 'attrs_analizer',
        ],
    ]
],
...

Example of the data:

{
    "id": 1,
    "is_active": "1",
    "category_id": 189,
    ...
    "price": "48.00",
    "attrs": [
        { "attr_name": "Brand", "value": "TP-Link" },
        { "attr_name": "Model", "value": "TL-1" },
        { "attr_name": "Other", "value": "<div>Some text of 'Other' property<br><img src......><ul><li>......</ul></div>" }
    ]
},
{
    "id": 2,
    "is_active": "1",
    "category_id": 242,
    ...
    "price": "12.00",
    "attrs": [
        { "attr_name": "Brand", "value": "Lenovo" },
        { "attr_name": "Model", "value": "B570" },
        { "attr_name": "OS", "value": "Linux" },
        { "attr_name": "Other", "value": "<div>Some text of 'Other' property<br><img src......><ul><li>......</ul></div>" }
    ]
},
{
    "id": 3,
    "is_active": "1",
    "category_id": 242,
    ...
    "price": "24.00",
    "attrs": [
        { "attr_name": "Brand", "value": "Asus" },
        { "attr_name": "Model", "value": "QZ85" },
        { "attr_name": "OS", "value": "Windows" },
        { "attr_name": "Other", "value": "<div>Some text of 'Other' property<br><img src......><ul><li>......</ul></div>" }
    ]
}

Attributes such as "Model" and "Other" are not used when filtering products; they are only shown on the product page. For the other attributes (Brand, OS and others...) I want to get aggregations.

When I try to aggregate the attrs.value field, I of course get an aggregation over all the data (including the large "Other" field, which may contain a lot of HTML):

"aggs": { "facet_value": { "terms": { "field": "attrs.value", "size": 0 } } }

How can I exclude some attributes (here "Model" and "Other") from this aggregation?

Changing the mapping is a bad solution for me, but if it is unavoidable, please tell me how to do it. I suppose I would need to make "attrs" nested?

UPD:

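I want to receive:

  1. All attributes of the products in a given category, except the ones I specify in my system settings (in this example I exclude "Model" and "Other").
  2. The number of products next to each value.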

It should look like this:

For the category "Laptops":

Brand:

  • Lenovo (18)
  • Asus (19)
  • .....

OS:

  • Windows (19)
  • Linux (5)
  • ...

For "Computer monitors":

Brand:

  • Samsung (18)
  • LG (19)
  • .....

Resolution:

  • 1360x768 (19)
  • 1920x1080 (22)
  • ....

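I already use a terms aggregation to get the number of products in each category. I tried to exclude "attrs.attr_name": ["Model", "Other"], but I do not know how to exclude the "attrs.value" entries that belong to, for example, "attrs.attr_name": "Model".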

UPD2:

In my case, mapping attrs as a nested type increases the index size by 30%, from 2700Mi to 3510Mi. If there is no other option, I will have to live with it.

1 answer:

Answer 0 (score: 0)

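You first have to map attrs as a nested type and then use nested aggregations. (The keyword type in the mapping below exists from Elasticsearch 5.0 onwards; on 2.4 the equivalent is "type": "string" with "index": "not_analyzed".)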

PUT no_play
{
  "mappings": {
    "document_type" : {
      "properties": {
        "is_active" : {
          "type": "long"
        },
        "category_id" : {
          "type": "long"
        },
        "attrs" : {
          "type": "nested", 
          "properties": {
            "attr_name" : {
              "type" : "keyword"
            },
            "value" : {
              "type" : "keyword"
            }
          }
        }
      }
    }
  }
}


POST no_play/document_type
  {
    "id": 3,
    "is_active": "1",
    "category_id": 242,
    "price": "24.00",
    "attrs": [
      {
        "attr_name": "Brand",
        "value": "Asus"
      },
      {
        "attr_name": "Model",
        "value": "QZ85"
      },
      {
        "attr_name": "OS",
        "value": "<div>Some text of 'Other' property<br><img src......><ul><li>......</ul></div>"
      },
      {
        "attr_name": "Other",
        "value": "<div>Some text of 'Other' property<br><img src......><ul><li>......</ul></div>"
      }
    ]
  }
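As a rough illustration of why the nested mapping matters: without it, Elasticsearch indexes an array of objects as flat multi-valued fields, so the link between each attr_name and its value is lost. The sketch below only mimics that flattening in plain Python; `flatten_object_array` is a hypothetical helper, not part of any Elasticsearch client, and the document is a shortened version of the example above.

```python
from collections import defaultdict


def flatten_object_array(doc, field):
    """Mimic how a non-nested object array is indexed: one multi-valued
    field per sub-key, with the pairing between sub-keys lost."""
    flat = defaultdict(list)
    for obj in doc[field]:
        for key, value in obj.items():
            flat[f"{field}.{key}"].append(value)
    return dict(flat)


# Shortened example document (hypothetical values taken from the question).
doc = {
    "id": 3,
    "attrs": [
        {"attr_name": "Brand", "value": "Asus"},
        {"attr_name": "Model", "value": "QZ85"},
        {"attr_name": "OS", "value": "Windows"},
    ],
}

flat = flatten_object_array(doc, "attrs")
print(flat)
# After flattening, "QZ85" is just one of the values of attrs.value;
# nothing records that it belonged to the "Model" attribute, so a filter
# on attrs.attr_name cannot single out its value.
```

With the nested type, each attrs object is stored as its own hidden document, which is what lets a filter on attrs.attr_name apply to the matching attrs.value only.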

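You did not mention how you want to aggregate, so here are both options.

Case 1) If you want to count each attrs object individually. This metric gives you the number of occurrences of each term.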

POST no_play/_search
{
  "size": 0,
  "aggs": {
    "nested_aggregation_value": {
      "nested": {
        "path": "attrs"
      },
      "aggs": {
        "value_term": {
          "terms": {
            "field": "attrs.value",
            "size": 10
          }
        }
      }
    }
  }
}

To count the root documents that have a given attrs value instead, you need to add a reverse nested aggregation, which lifts the aggregator back up to the level of the root document.

POST no_play/_search
{
  "size": 0,
  "aggs": {
    "nested_aggregation_value": {
      "nested": {
        "path": "attrs"
      },
      "aggs": {
        "value_term": {
          "terms": {
            "field": "attrs.value",
            "size": 10
          },
          "aggs": {
            "reverse_back_to_roots": {
              "reverse_nested": {}
            }
          }
        }
      }
    }
  }
}

Consider the following document.

{
    "id": 3,
    "is_active": "1",
    "category_id": 242,
    "price": "24.00",
    "attrs": [
      {
        "attr_name": "Brand",
        "value": "Asus"
      },
      {
        "attr_name": "Model",
        "value": "QZ85"
      },
      {
        "attr_name": "OS",
        "value": "repeated value"
      },
      {
        "attr_name": "Other",
        "value": "repeated value"
      }
    ]
  }

For the first query the count for 'repeated value' will be 2, and for the second query it will be 1.
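The difference between the two counts can be reproduced with a small plain-Python simulation. This is only a sketch of the counting logic, not of how Elasticsearch executes the queries; `nested_value_counts` and `root_doc_counts` are hypothetical names.

```python
from collections import Counter

# The example document from above, reduced to the fields that matter here.
doc = {
    "id": 3,
    "attrs": [
        {"attr_name": "Brand", "value": "Asus"},
        {"attr_name": "Model", "value": "QZ85"},
        {"attr_name": "OS", "value": "repeated value"},
        {"attr_name": "Other", "value": "repeated value"},
    ],
}


def nested_value_counts(docs):
    """First query: every nested attrs object counts once per value."""
    counts = Counter()
    for d in docs:
        for attr in d["attrs"]:
            counts[attr["value"]] += 1
    return counts


def root_doc_counts(docs):
    """Second query (reverse_nested): every root document counts at most
    once per value, however many of its attrs carry that value."""
    counts = Counter()
    for d in docs:
        for value in {attr["value"] for attr in d["attrs"]}:
            counts[value] += 1
    return counts


docs = [doc]
print(nested_value_counts(docs)["repeated value"])  # 2
print(root_doc_counts(docs)["repeated value"])      # 1
```

For facet counts shown next to a filter value, the second number (products, not attribute entries) is usually the one wanted.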

Note

Here is how to add a filter that excludes attributes by name (Model and Brand in this example):
POST no_play/_search
{
    "size": 0,
    "aggs": {
        "nested_aggregation_value": {
            "nested": {
                "path": "attrs"
            },
            "aggs": {
                "filtered_results": {
                    "filter": {
                        "bool": {
                            "must_not": [{
                                "terms": {
                                    "attrs.attr_name": ["Model", "Brand"]
                                }
                            }]
                        }
                    },
                    "aggs": {
                        "value_term": {
                            "terms": {
                                "field": "attrs.value",
                                "size": 10
                            }
                        }
                    }
                }
            }
        }
    }
}


POST no_play/_search
 {
    "size": 0,
    "aggs": {
        "nested_aggregation_value": {
            "nested": {
                "path": "attrs"
            },
            "aggs": {
                "filtered_results": {
                    "filter": {
                        "bool": {
                            "must_not": [{
                                "terms": {
                                    "attrs.attr_name": ["Model", "Brand"]
                                }
                            }]
                        }
                    },
                    "aggs": {
                        "value_term": {
                            "terms": {
                                "field": "attrs.value",
                                "size": 10
                            },
                            "aggs": {
                                "reverse_back_to_roots": {
                                    "reverse_nested": {}
                                }
                            }
                        }
                    }
                }
            }
        }
    }
 }
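To get facets grouped by attribute name, as in the layout the question asks for, the same pieces can be combined with an extra terms level on attrs.attr_name. The following is a minimal sketch that only builds the request body and parses a response. It assumes the nested mapping from this answer; the function names, the aggregation names (`attrs`, `kept`, `names`, `values`, `products`) and the sample response are all hypothetical, and the sample counts are simply the ones from the question's example.

```python
def build_facet_body(category_id, excluded_attrs, size=10):
    """Build a search body: nested -> exclude names -> group by name -> by value."""
    return {
        "size": 0,
        # Restrict the facets to one category (field name from the question).
        "query": {"bool": {"filter": [{"term": {"category_id": category_id}}]}},
        "aggs": {
            "attrs": {
                "nested": {"path": "attrs"},
                "aggs": {
                    "kept": {
                        "filter": {
                            "bool": {
                                "must_not": [
                                    {"terms": {"attrs.attr_name": list(excluded_attrs)}}
                                ]
                            }
                        },
                        "aggs": {
                            "names": {
                                "terms": {"field": "attrs.attr_name", "size": size},
                                "aggs": {
                                    "values": {
                                        "terms": {"field": "attrs.value", "size": size},
                                        # Count products, not attrs entries.
                                        "aggs": {"products": {"reverse_nested": {}}},
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


def parse_facets(response):
    """Turn the aggregation response into {attr_name: {value: product_count}}."""
    facets = {}
    for name_bucket in response["aggregations"]["attrs"]["kept"]["names"]["buckets"]:
        facets[name_bucket["key"]] = {
            b["key"]: b["products"]["doc_count"]
            for b in name_bucket["values"]["buckets"]
        }
    return facets


body = build_facet_body(242, ["Model", "Other"])

# Hand-written, trimmed response in the shape this body would produce.
sample_response = {"aggregations": {"attrs": {"kept": {"names": {"buckets": [
    {"key": "Brand", "values": {"buckets": [
        {"key": "Asus", "products": {"doc_count": 19}},
        {"key": "Lenovo", "products": {"doc_count": 18}},
    ]}},
]}}}}}

print(parse_facets(sample_response))
```

The excluded names are passed in as a parameter so they can come from the system settings mentioned in the question; `body` would be sent as the JSON payload of `POST no_play/_search`.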

Thanks