Question

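In RethinkDB, I need to perform a join between two tables (representing a has-and-belongs-to-many relationship) and then sort the results of the join. There may be hundreds of thousands or even millions of results, so I need to sort them efficiently.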

Ideally, I would like to use orderBy() with an index. But orderBy() can only use an index when called on a table, and .eqJoin() returns a stream or an array.

Here is an example of the query I am using. I want to get the conversations that have a particular topic:

r.table('conversations_topics')
  .getAll('c64a00d3-1b02-4045-88e7-ac3b4fee478f', {index: 'topics_id'})
  .eqJoin('conversations_id', r.table('conversations'))
  .map(row => row('right'))
  .orderBy('createdAt')

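Once a topic has more than a few thousand conversations, the unindexed orderBy() used here becomes unacceptably slow, and it breaks entirely at 100,000 results because of RethinkDB's array size limit. Topics in this database can easily contain hundreds of thousands or even millions of conversations, so this is not acceptable.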

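I only need this query to return a small number of results at a time (say 25), but I need those results in order, so I cannot apply the limit until after the sort. Any ideas?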

Answer 1

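I think an alternative is to drop the conversations_topics table and embed the topic data in the conversations table. With that in place, we can create a compound index and use it to filter and order in the same step.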

r.table('conversations').indexCreate('topicAndDate', function(doc) {
  return doc('topics')
    .map(function(topic) {
      return [topic, doc('createdAt')]
    })
    .coerceTo('array')
}, {multi: true})
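
To make it clearer what this index stores, here is a small plain-JavaScript mirror of the index function (not ReQL, and not run against RethinkDB). It assumes each conversation document carries a `topics` array of topic IDs and a `createdAt` value; the sample document and the numeric timestamp are made up for illustration.

```javascript
// Plain-JavaScript mirror of the index function above (illustration only).
// With {multi: true}, each element of the returned array becomes its own
// index entry pointing at the same document.
function indexKeys(doc) {
  return doc.topics.map(topic => [topic, doc.createdAt]);
}

// Hypothetical conversation that belongs to two topics.
const sample = { id: 'c2', topics: ['t1', 't2'], createdAt: 30 };

// One [topic, createdAt] key per topic the conversation belongs to.
const keys = indexKeys(sample);
console.log(keys);
```

So a conversation with two topics shows up twice in the index, once under each topic, and within a topic the entries are ordered by createdAt.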

Then you can use it in a query like this:

r.table('conversations')
  .between(['c64a00d3-1b02-4045-88e7-ac3b4fee478f', r.minval], ['c64a00d3-1b02-4045-88e7-ac3b4fee478f', r.maxval], {index: 'topicAndDate'})
  .orderBy({index: r.desc('topicAndDate')})
  .limit(25)

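The key here is that we use the same index for both orderBy and between. If you know the time range, you can speed things up further by passing actual time values to the between command instead of minval and maxval.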
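
As a rough sketch of why one index can serve both the filter and the sort, the following plain JavaScript models an ordered compound index as a sorted array. It is only a model and does not describe how RethinkDB is implemented; the sample data, the numeric timestamps, and the names `compareKeys` and `latestForTopic` are all hypothetical.

```javascript
// Compare two [topic, createdAt] keys element by element.
function compareKeys(a, b) {
  if (a[0] !== b[0]) return a[0] < b[0] ? -1 : 1;
  return a[1] - b[1];
}

// Hypothetical data: createdAt is a plain number standing in for a timestamp.
const conversations = [
  { id: 'c1', topics: ['t1'], createdAt: 10 },
  { id: 'c2', topics: ['t1', 't2'], createdAt: 30 },
  { id: 'c3', topics: ['t2'], createdAt: 20 },
  { id: 'c4', topics: ['t1'], createdAt: 40 },
];

// Model of the multi index: one entry per (topic, createdAt) pair, kept sorted.
const index = conversations
  .flatMap(doc => doc.topics.map(topic => ({ key: [topic, doc.createdAt], id: doc.id })))
  .sort((a, b) => compareKeys(a.key, b.key));

// Read one topic's entries newest first, stopping once `limit` are collected.
// Nothing is sorted at query time; the order comes from the index itself.
function latestForTopic(topic, limit) {
  const result = [];
  for (let i = index.length - 1; i >= 0 && result.length < limit; i--) {
    const entryTopic = index[i].key[0];
    if (entryTopic > topic) continue; // a real index would seek here directly
    if (entryTopic < topic) break;    // left the topic's range, nothing more to read
    result.push(index[i].id);
  }
  return result;
}

console.log(latestForTopic('t1', 2));
```

Because all entries for a topic sit next to each other and are already ordered by createdAt, the scan touches only as many entries as the limit asks for, however many conversations the topic has.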

Hopefully that will be faster.
