Question

How do I reduce/aggregate multiple fields in a table? This does not look efficient:

r.object(
  'favorite_count',
  r.db('twitterdb').table('tweets').map(tweet => tweet('favorite_count')).avg().round(),
  'retweet_count',
  r.db('twitterdb').table('tweets').map(tweet => tweet('retweet_count')).avg().round()
)

(Expected) result:

{
    "favorite_count": 17,
    "retweet_count": 156
}

Answer 1

I'm not sure whether RethinkDB can do this in a single pass with its built-in functions, but you can easily implement the avg function yourself:

r.db('twitterdb')
  .table('tweets')
  .fold(
    {_: 0, favorite_count: 0, retweet_count: 0},
    (a, tweet) => ({
      _: a('_').add(1),
      favorite_count: a('favorite_count').add(tweet('favorite_count')),
      retweet_count: a('retweet_count').add(tweet('retweet_count'))
    })
  )
  .do(a => ({
    favorite_count: r.branch(a('_').gt(0), a('favorite_count').div(a('_')).round(), null),
    retweet_count: r.branch(a('_').gt(0), a('retweet_count').div(a('_')).round(), null)
  }))

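I quickly tested the above on a small data set, and with query profiling enabled it at least halved the number of shard accesses and took less time to execute. However, I'm not sure about the overall profiler output, and I don't think I can interpret its details (I believe the native avg is better optimized, but this still looks cheaper than reading the data in at least two passes). In addition, this custom avg implementation is friendlier to an empty table: it returns null instead of throwing an error.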
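To make the single-pass idea concrete without a running server, here is a minimal plain JavaScript sketch of the same logic over an in-memory array. It is not ReQL: `averageFields` and the sample rows are hypothetical, chosen only for illustration.

```javascript
// Plain JavaScript analogue of the fold/do query above (not ReQL).
// averageFields is a hypothetical helper; rows stand in for tweet documents.
function averageFields(rows, fields) {
  // One pass: keep a row counter and a running sum per field.
  const acc = { count: 0 };
  for (const f of fields) acc[f] = 0;
  for (const row of rows) {
    acc.count += 1;
    for (const f of fields) acc[f] += row[f];
  }
  // Divide once at the end; an empty input yields null instead of an error.
  const result = {};
  for (const f of fields) {
    result[f] = acc.count > 0 ? Math.round(acc[f] / acc.count) : null;
  }
  return result;
}

// Made-up sample data, not taken from the real table.
const sample = [
  { favorite_count: 10, retweet_count: 100 },
  { favorite_count: 20, retweet_count: 200 },
  { favorite_count: 21, retweet_count: 168 }
];
console.log(averageFields(sample, ['favorite_count', 'retweet_count']));
console.log(averageFields([], ['favorite_count', 'retweet_count']));
```

With this sample the first call yields favorite_count 17 and retweet_count 156, and the empty input yields null for both fields, mirroring the r.branch guard in the query.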

Answer 2

If the number of rows is known in advance (e.g. 7430), this is faster:

r.db('twitterdb').table('tweets')
  .reduce((agg, item) => {
    return {
      favorite_count: agg('favorite_count').add(item('favorite_count')),
      retweet_count: agg('retweet_count').add(item('retweet_count'))
    }
  })
  .do(result => r.object(
    'favorite_count', result('favorite_count').div(7430).round(),
    'retweet_count', result('retweet_count').div(7430).round()
  ))
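The hard-coded 7430 could be avoided by carrying a row counter through the reduction. The plain JavaScript sketch below (not ReQL; `averageByMapReduce` is a hypothetical name) maps each row to a partial aggregate with a count of 1 and then merges partials pairwise. The same shape should be expressible in ReQL with map followed by reduce, though that is untested here.

```javascript
// Plain JavaScript analogue (not ReQL) of a reduce that also counts rows.
// averageByMapReduce is a hypothetical helper name.
function averageByMapReduce(rows) {
  // Reducing an empty array without an initial value would throw.
  if (rows.length === 0) return null;
  const total = rows
    // Map: each row becomes a partial aggregate covering exactly one row.
    .map(row => ({
      count: 1,
      favorite_count: row.favorite_count,
      retweet_count: row.retweet_count
    }))
    // Reduce: merging two partials is plain addition, so order does not matter.
    .reduce((a, b) => ({
      count: a.count + b.count,
      favorite_count: a.favorite_count + b.favorite_count,
      retweet_count: a.retweet_count + b.retweet_count
    }));
  // Divide by the counted rows rather than a constant.
  return {
    favorite_count: Math.round(total.favorite_count / total.count),
    retweet_count: Math.round(total.retweet_count / total.count)
  };
}
```

Because the count travels with the sums, the divisor always matches the rows actually aggregated, even if the table grows.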

Aggregate multiple fields in one RethinkDB query
