Question

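In SOLR 6.0.0, is it possible to filter facet results further before the response is returned to the client? My use case uses the facet "sum" function to compute each customer's total deposits. So far so good.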

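I want to filter the rows returned in the facet response so that only customers who have deposited more than a certain threshold are shown. I can't seem to find a way to do this. Is it possible?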

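I am trying to avoid post-processing the response after it comes back from the SOLR server, and I would like to know whether this can be done on the server side. The reason is that the data set on the server may be very large. If I did this on the client side, I might need to run several facet searches with the 'limit' and 'offset' parameters to find where the threshold falls, and only then carry on with the use case.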
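For reference, the client-side workaround I am trying to avoid would look roughly like the sketch below. It assumes the buckets come back sorted by the sum in descending order. `fetch_page` is a hypothetical stand-in for whatever call sends the facet request with a given `offset` and `limit`; it is not part of SOLR or any client library.

```python
from typing import Any, Callable, Dict, List

Bucket = Dict[str, Any]  # e.g. {"val": "Mary", "count": 3, "gross": 130.0}


def buckets_above(fetch_page: Callable[[int, int], List[Bucket]],
                  threshold: float,
                  page_size: int = 100) -> List[Bucket]:
    """Page through facet buckets sorted by 'gross' descending and keep
    those whose sum exceeds the threshold.

    fetch_page(offset, limit) is a hypothetical helper that returns one
    page of buckets; it is an assumption made for this sketch.
    """
    kept: List[Bucket] = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return kept  # ran out of buckets
        for bucket in page:
            if bucket["gross"] <= threshold:
                return kept  # sorted descending, so nothing further qualifies
            kept.append(bucket)
        offset += page_size


# In-memory stand-in for the server, using the per-customer totals
# of the sample data set up below.
SAMPLE = [
    {"val": "Mary", "count": 3, "gross": 130.0},
    {"val": "John", "count": 2, "gross": 50.0},
    {"val": "Peter", "count": 1, "gross": 30.0},
]


def fake_fetch(offset: int, limit: int) -> List[Bucket]:
    return SAMPLE[offset:offset + limit]


result = buckets_above(fake_fetch, threshold=100, page_size=2)
print(result)
```

Each page costs one round trip, which is exactly the overhead I would like the server to save me.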

Reference: http://yonik.com/json-facet-api/

Here is how I set up my environment.

Using the following data

Create a CSV file named entry.csv

id,name_s,date_dt,flow_s,amount_f
1,John,2016-01-01T00:00:00Z,Deposit,10
2,Mary,2016-01-15T00:00:00Z,Deposit,20
3,Peter,2016-01-19T00:00:00Z,Deposit,30
4,John,2016-01-20T00:00:00Z,Deposit,40
5,Mary,2016-01-22T00:00:00Z,Deposit,50
6,Mary,2016-01-23T00:00:00Z,Deposit,60

Start the SOLR server
```
$ bin/solr start
```
Create a new core named simple.
```
$ bin/solr create -c simple
```
Import the data
```
$ bin/post -c simple ~/entry.csv
```

Query the data to verify that the import succeeded

$ curl -s 'http://localhost:8983/solr/simple/select?indent=on&q=*:*&wt=json'
{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "q":"*:*",
      "indent":"on",
      "wt":"json"}},
  "response":{"numFound":6,"start":0,"docs":[
      {
        "id":"1",
        "name_s":"John",
        "date_dt":"2016-01-01T00:00:00Z",
        "flow_s":"Deposit",
        "amount_f":10.0,
        "_version_":1532465926194069504},
      {
        "id":"2",
        "name_s":"Mary",
        "date_dt":"2016-01-15T00:00:00Z",
        "flow_s":"Deposit",
        "amount_f":20.0,
        "_version_":1532465926248595456},
      {
        "id":"3",
        "name_s":"Peter",
        "date_dt":"2016-01-19T00:00:00Z",
        "flow_s":"Deposit",
        "amount_f":30.0,
        "_version_":1532465926250692608},
      {
        "id":"4",
        "name_s":"John",
        "date_dt":"2016-01-20T00:00:00Z",
        "flow_s":"Deposit",
        "amount_f":40.0,
        "_version_":1532465926252789760},
      {
        "id":"5",
        "name_s":"Mary",
        "date_dt":"2016-01-22T00:00:00Z",
        "flow_s":"Deposit",
        "amount_f":50.0,
        "_version_":1532465926253838336},
      {
        "id":"6",
        "name_s":"Mary",
        "date_dt":"2016-01-23T00:00:00Z",
        "flow_s":"Deposit",
        "amount_f":60.0,
        "_version_":1532465926255935488}]
  }}

Query the data with a facet to show each customer's total deposit amount.

$ curl -s http://localhost:8983/solr/simple/select -d 'q=*:*&rows=0&
json.facet={
  customers:{
   type:terms,
   field:name_s,
   sort:{gross:desc},
   facet:{
     gross:"sum(amount_f)"
   }
  }
}'

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
    <lst name="params">
      <str name="q">*:*</str>
      <str name="json.facet">{   customers:{    type:terms,    field:name_s,    sort:{gross:desc},    facet:{      gross:"sum(amount_
      <str name="rows">0</str>
    </lst>
  </lst>
  <result name="response" numFound="6" start="0"/>
  <lst name="facets">
    <int name="count">6</int>
    <lst name="customers">
      <arr name="buckets">
        <lst>
          <str name="val">Mary</str>
          <int name="count">3</int>
          <double name="gross">130.0</double>
        </lst>
        <lst>
          <str name="val">John</str>
          <int name="count">2</int>
          <double name="gross">50.0</double>
        </lst>
        <lst>
          <str name="val">Peter</str>
          <int name="count">1</int>
          <double name="gross">30.0</double>
        </lst>
      </arr>
    </lst>
  </lst>
</response>

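What I want to achieve

I want to keep only those customers who have deposited more than $100. That means in the response I expect to see only Mary, whose total deposits are 130. I do not want John or Peter returned.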
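To make the expected outcome concrete, here is a small Python sketch that reproduces it from the sample rows in entry.csv, with no SOLR involved. The function name `totals_above` is made up for illustration; it only shows what I would like the server-side filter to return.

```python
import csv
import io
from collections import defaultdict

# Same rows as entry.csv above.
ENTRY_CSV = """id,name_s,date_dt,flow_s,amount_f
1,John,2016-01-01T00:00:00Z,Deposit,10
2,Mary,2016-01-15T00:00:00Z,Deposit,20
3,Peter,2016-01-19T00:00:00Z,Deposit,30
4,John,2016-01-20T00:00:00Z,Deposit,40
5,Mary,2016-01-22T00:00:00Z,Deposit,50
6,Mary,2016-01-23T00:00:00Z,Deposit,60
"""


def totals_above(csv_text, threshold):
    """Sum amount_f per name_s and keep totals strictly above threshold."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["name_s"]] += float(row["amount_f"])
    return {name: total for name, total in totals.items() if total > threshold}


print(totals_above(ENTRY_CSV, 100))  # {'Mary': 130.0}
```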

Answer 1

In version 6.3.0 of Apache SOLR Cloud, there is a built-in "/sql" handler. See the Apache wiki page at the URL below for details on the handler.

https://cwiki.apache.org/confluence/display/solr/Parallel+SQL+Interface#ParallelSQLInterface-/sqlRequestHandler

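To get the result asked for in the question, the following query can be submitted. The result will show only the name and the aggregated total, and only for customers whose total exceeds 100: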

stmt=SELECT name_s, sum(amount_f) as total
FROM simple
GROUP BY name_s
HAVING total > 100
ORDER BY total desc

In my local environment, SOLR Cloud is hosted at IP 10.0.0.40 on port 8983, and I created a collection named "simple".

curl --data-urlencode 'stmt=SELECT name_s, sum(amount_f) as total FROM simple GROUP BY name_s HAVING total>100 ORDER BY total desc' http://10.0.0.40:8983/solr/simple/sql


{"result-set":{"docs":[
{"name_s":"Mary","total":130.0},
{"EOF":true,"RESPONSE_TIME":7}]}}
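If you would rather issue the same request from code than from curl, a minimal Python sketch using only the standard library might look like this. The helper names `run_sql` and `parse_result_set` are my own and not part of any SOLR client; the host, collection and statement are the ones from my environment above.

```python
import json
import urllib.parse
import urllib.request

STMT = ("SELECT name_s, sum(amount_f) as total FROM simple "
        "GROUP BY name_s HAVING total > 100 ORDER BY total desc")


def parse_result_set(body):
    """Return the data tuples from a /sql response, dropping the EOF marker."""
    docs = json.loads(body)["result-set"]["docs"]
    return [doc for doc in docs if not doc.get("EOF")]


def run_sql(base_url, collection, stmt):
    """POST stmt to the collection's /sql handler, like the curl call above."""
    data = urllib.parse.urlencode({"stmt": stmt}).encode("utf-8")
    url = f"{base_url}/solr/{collection}/sql"
    with urllib.request.urlopen(url, data=data) as resp:
        return parse_result_set(resp.read().decode("utf-8"))


# Parsing the response shown above, without contacting a server:
SAMPLE_BODY = """{"result-set":{"docs":[
{"name_s":"Mary","total":130.0},
{"EOF":true,"RESPONSE_TIME":7}]}}"""
print(parse_result_set(SAMPLE_BODY))  # [{'name_s': 'Mary', 'total': 130.0}]
```

Against a live server the call would be `run_sql("http://10.0.0.40:8983", "simple", STMT)`.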
