Using the schema defined below, I can use map reduce to aggregate the deliver_count field across all dates (dates is an embedded array on the campaign document).
{
campaign_id: 1,
status: 'running',
dates: {
'20130926' => {
delivered: 1,
failed: 1,
queued: 1,
clicked: 1,
males_count: 1,
females_count: 1,
pacific_region: { clicked_count: 10 },
america_region: { clicked_count: 10 },
atlantic_region: { clicked_count: 10 },
europe_region: { clicked_count: 10 },
africa_region: { clicked_count: 10 },
etc_region: { clicked_count: 10 },
asia_region: { clicked_count: 10 },
australia_region: { clicked_count: 10 }
},
'20130927' => {
delivered: 1,
failed: 1,
queued: 1,
clicked: 1,
males_count: 1,
females_count: 1,
pacific_region: { clicked_count: 10 },
america_region: { clicked_count: 10 },
atlantic_region: { clicked_count: 10 },
europe_region: { clicked_count: 10 },
africa_region: { clicked_count: 10 },
etc_region: { clicked_count: 10 },
asia_region: { clicked_count: 10 },
australia_region: { clicked_count: 10 }
},
'20130928' => {
delivered: 1,
failed: 1,
queued: 1,
clicked: 1,
males_count: 1,
females_count: 1,
pacific_region: { clicked_count: 10 },
america_region: { clicked_count: 10 },
atlantic_region: { clicked_count: 10 },
europe_region: { clicked_count: 10 },
africa_region: { clicked_count: 10 },
etc_region: { clicked_count: 10 },
asia_region: { clicked_count: 10 },
australia_region: { clicked_count: 10 }
}
}
}
The code below sums the clicked_count field under asia_region across all dates and returns 30 (the combined value for all dates):
$rethinkdb.table(:daily_stat_campaigns).filter { |daily_stat_campaign| daily_stat_campaign[:campaign_id].eq 1 }[0][:dates].do { |doc|
doc.keys.map { |key|
doc.get_field(key)[:asia_region][:clicked_count].default(0)
}.reduce { |left, right|
left+right
}
}.run
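Is it possible to run the code above, but for multiple regions, so that a single query returns several sums? The output I am trying to achieve is something like the pseudo-result below.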
[{ asia_region: {clicked_count: 30}}, {america_region: {clicked_count: 30} }]
Answer 0 (score: 1)
I'm a bit confused by the code you posted. Why is everything inside the filter? To get the output you want, do the following:
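regions = [:pacific_region, :america_region, ...]
reg_clicks = r.table(:daily_stat_campaigns).concat_map { |row|
  row[:dates]
    .coerce_to("ARRAY")        # [[date, stats], ...]
    .concat_map { |date|
      date[1]                  # the stats object; date[0] is the date key
        .pluck(regions)
        .coerce_to("ARRAY")    # [[region, {clicked_count: n}], ...]
    }
}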
You can now run reg_clicks, and it should look something like this:
$ reg_clicks.run()
[[:asia_region, {clicked_count: 30}], [:etc_region, {clicked_count: 30}], ...]
Now we need one last transformation to aggregate it:
aggregate = reg_clicks.map{ |reg|
  # reg is a [region, stats] pair: reg[0] is the region name, reg[1] the stats object
  {reg: reg[0], clicked_count: reg[1][:clicked_count]}
}
.group_by(:reg, r.sum(:clicked_count))
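This will give you output like this:
[{group: {reg: "asia_region"}, reduction: 150} ...]
If you want it to look exactly like what you asked for, you can apply a final transformation:
aggregate.map{ |row|
  # group_by returns each group as an object keyed by the grouping field
  [row[:group][:reg], row[:reduction]]
}
.coerce_to("OBJECT")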
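To make the data flow of the steps above easier to follow, here is a minimal plain-Ruby sketch of the same pipeline, run against an in-memory hash instead of a RethinkDB table. SAMPLE_DOC, region_pairs, and sum_clicks are hypothetical names introduced for illustration only; this models the shape of the data, not the ReQL API.

```ruby
# Plain-Ruby model of the pipeline above; no RethinkDB connection involved.
# SAMPLE_DOC is a cut-down, hypothetical copy of the campaign document.
SAMPLE_DOC = {
  campaign_id: 1,
  dates: {
    '20130926' => { asia_region: { clicked_count: 10 }, america_region: { clicked_count: 10 } },
    '20130927' => { asia_region: { clicked_count: 10 }, america_region: { clicked_count: 10 } },
    '20130928' => { asia_region: { clicked_count: 10 }, america_region: { clicked_count: 10 } }
  }
}.freeze

# Step 1: flatten every date into [region, stats] pairs (the concat_map step).
def region_pairs(docs, regions)
  docs.flat_map do |doc|
    doc[:dates].values.flat_map do |stats|
      stats.select { |region, _| regions.include?(region) }.to_a
    end
  end
end

# Steps 2 and 3: sum clicked_count per region (the map + group_by step).
def sum_clicks(pairs)
  pairs.each_with_object(Hash.new(0)) do |(region, stats), totals|
    totals[region] += stats.fetch(:clicked_count, 0)
  end
end

regions = [:asia_region, :america_region]
totals = sum_clicks(region_pairs([SAMPLE_DOC], regions))
p totals
```

With three dates at 10 clicks each, every requested region totals 30, which matches the single-region result from the question.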
These queries would definitely be nicer if you normalized your data a bit. Break things out into two more tables, :dates and :region_clicks, that look like this:
#dates
{
id: 0,
campaign_id: 1,
date: '20130927',
delivered: 1,
failed: 1,
queued: 1,
clicked: 1,
males_count: 1
}
#region_clicks
{
region: "asia_region",
click_count: 30,
date_id: 0
}
Then your query would be:
r.table(:region_clicks).group_by(:region, r.sum(:click_count)).run()
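As a rough illustration of that normalization, the sketch below splits one nested campaign document into rows for the two tables in plain Ruby. The normalize helper, the use of the date's position as its id, and the rule that keys ending in _region become :region_clicks rows are all assumptions made for this example; inserting the rows into RethinkDB is left out.

```ruby
# Hypothetical helper: splits one nested campaign document into rows for
# the :dates and :region_clicks tables described above. Pure Ruby, no database.
def normalize(doc)
  date_rows = []
  click_rows = []
  doc[:dates].each_with_index do |(date, stats), id|
    # Keys ending in _region become :region_clicks rows; the rest stay on the date row.
    region_stats, counters = stats.partition { |key, _| key.to_s.end_with?('_region') }
    date_rows << { id: id, campaign_id: doc[:campaign_id], date: date }.merge(counters.to_h)
    region_stats.each do |region, values|
      click_rows << { region: region.to_s, click_count: values[:clicked_count], date_id: id }
    end
  end
  [date_rows, click_rows]
end

doc = {
  campaign_id: 1,
  dates: {
    '20130926' => { delivered: 1, clicked: 1, asia_region: { clicked_count: 10 } },
    '20130927' => { delivered: 1, clicked: 1, asia_region: { clicked_count: 10 } }
  }
}
date_rows, click_rows = normalize(doc)
p date_rows
p click_rows
```

Each date yields one :dates row plus one :region_clicks row per region, linked through date_id.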
Answer 1 (score: 1)
This seems to work:
require 'awesome_print' # For better readability on output
regions = [:pacific_region, :america_region]
reg_clicks = $rethinkdb.table(:daily_stat_campaigns).filter { |daily_stat_campaign| daily_stat_campaign[:campaign_id].eq 1 }[0][:dates].do { |doc|
doc.keys.concat_map { |key|
doc
.get_field(key)
.pluck(regions)
.coerce_to("ARRAY")
}
}
ap reg_clicks.run
This will output something like: [["america_region", {"clicked_count"=>10}], ["pacific_region", {"clicked_count"=>10}], ["america_region", {"clicked_count"=>10}], ["pacific_region", {"clicked_count"=>10}], ["america_region", {"clicked_count"=>10}], ["pacific_region", {"clicked_count"=>10}]]
aggregate = reg_clicks.map { |reg|
{ reg: reg[0], clicked_count: reg[1][:clicked_count] }
}
ap aggregate.run
This will output: [{"reg"=>"america_region", "clicked_count"=>10}, {"reg"=>"pacific_region", "clicked_count"=>10}, {"reg"=>"america_region", "clicked_count"=>10}, {"reg"=>"pacific_region", "clicked_count"=>10}, {"reg"=>"america_region", "clicked_count"=>10}, {"reg"=>"pacific_region", "clicked_count"=>10}]
ap aggregate.group_by(:reg, $rethinkdb_rql.sum(:clicked_count)).run
Output: [{"reduction"=>30, "group"=>{"reg"=>"america_region"}}, {"reduction"=>30, "group"=>{"reg"=>"pacific_region"}}]
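That result is not yet in the shape the question asked for. One option, sketched below in plain Ruby, is to reshape the returned rows on the client. to_region_totals is a hypothetical helper, and the input rows are copied from the output shown above rather than fetched from a live query.

```ruby
# Hypothetical helper: reshapes group_by result rows into the
# [{ region => { clicked_count: n } }, ...] form asked for in the question.
def to_region_totals(rows)
  rows.map do |row|
    { row['group']['reg'].to_sym => { clicked_count: row['reduction'] } }
  end
end

rows = [
  { 'reduction' => 30, 'group' => { 'reg' => 'america_region' } },
  { 'reduction' => 30, 'group' => { 'reg' => 'pacific_region' } }
]
p to_region_totals(rows)
```

Assuming the query result has the same shape as above, the helper could be applied directly to the array returned by run.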