Question

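I'm trying to write the most efficient query to find all documents that do not have a specific field. Is there a better way to do this than the examples listed below?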

// Get the ids of all documents missing "location"
r.db("mydb").table("mytable").filter({location: null},{default: true}).pluck("id")

// Get a count of all documents missing "location"
r.db("mydb").table("mytable").filter({location: null},{default: true}).count()

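Right now, these queries take about 300-400ms on a table with ~40k documents, which seems rather slow. Furthermore, in this specific case, the "location" attribute contains latitude/longitude and has a geospatial index.

Is there any way to accomplish this? Thanks!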

Answer 1

A naive suggestion

You could use the hasFields method together with the not method to filter out the unwanted documents:

r.db("mydb").table("mytable")
  .filter(function (row) {
    return row.hasFields({ location: true }).not()
  })

This may or may not be faster, but it's worth trying.
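As a minimal sketch, the two queries from the question could be rewritten with hasFields as shown here. The wrapper function missingLocationQueries and its r parameter are illustrative names only; r stands for the RethinkDB driver module and is passed in so that nothing touches a server until .run(conn) is called. Note that hasFields treats a field whose value is null as absent, so documents with location: null are matched too. Whether this beats the original filter has not been measured here.

```javascript
// Sketch: the question's two queries, expressed with hasFields().not().
// `missingLocationQueries` is a hypothetical helper; `r` is the driver module,
// e.g. require('rethinkdb').
function missingLocationQueries(r) {
  // Shared base query: every document that lacks a non-null "location".
  const missing = r.db('mydb').table('mytable').filter(function (row) {
    return row.hasFields('location').not();
  });
  return {
    ids: missing.pluck('id'), // ids of the documents missing "location"
    count: missing.count(),   // how many documents are missing "location"
  };
}

// Usage (requires an open connection `conn`, so it is left commented out):
// const q = missingLocationQueries(require('rethinkdb'));
// q.count.run(conn).then(console.log);
```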

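Using a secondary index

Ideally, you'd want a way to make location a secondary index and then use getAll or between, since queries that use an index are generally faster than a full-table filter. A plain index on location cannot find the rows that lack the field, though, because those rows are not stored in the index. One way to work around this is to give every row that has no location the value false for location. You would then create a secondary index on location. Finally, you can query the table with getAll as often as you like!

Add a location property to all documents without a location

To do this, you first need to set location: false on every row that has no location. You could do it like this: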

r.db("mydb").table("mytable")
  .filter(function (row) {
    return row.hasFields({ location: true }).not()
  })
  .update({
    location: false
  })

After this, you'll need to find a way to insert location: false every time you add a document without a location.
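One way to keep that invariant, sketched under the assumption that all inserts go through your own application code, is to normalize each document just before it is written. The helper name withLocationDefault is hypothetical and not part of RethinkDB.

```javascript
// Hypothetical helper: guarantee that a document carries an explicit
// `location` value before it is inserted.
function withLocationDefault(doc) {
  // Treat both a missing key and an explicit null as "no location".
  if (doc.location === undefined || doc.location === null) {
    // Return a copy so the caller's object is not mutated.
    return Object.assign({}, doc, { location: false });
  }
  return doc;
}

// Usage (requires the driver `r` and an open connection `conn`):
// r.db('mydb').table('mytable').insert(withLocationDefault(newDoc)).run(conn);
```

Returning a copy rather than mutating the input keeps the helper safe to call on objects that are reused elsewhere in the application.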

Create a secondary index for the table

Now that all documents have a location field, we can create a secondary index on location.

r.db("mydb").table("mytable")
 .indexCreate('location')

Keep in mind that you only have to backfill { location: false } and create the index once.

Using getAll

Now we can use getAll to query documents through the location index.

r.db("mydb").table("mytable")
 .getAll(false, { index: 'location' })

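This will probably be faster than the query above.

Using a secondary index (as a function)

You can also create a secondary index as a function. Basically, you create a function and then query the results of that function using getAll. This is probably easier and more straightforward than what I proposed before.

Create the index

Here it is: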

r.db("mydb").table("mytable")
 .indexCreate('has_location', function (x) {
   return x.hasFields('location');
 })

Use getAll

Here it is:

r.db("mydb").table("mytable")
 .getAll(false, { index: 'has_location' })
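To tie this back to the question, here is a sketch of how the original id and count queries might look on top of the has_location index. The wrapper hasLocationIndexQueries and its r parameter are illustrative names, with r standing for the driver module. It also includes an indexWait query, on the assumption that you want to be sure the index has finished building before you query it.

```javascript
// Sketch: the question's queries served from the function-based index.
// `hasLocationIndexQueries` is a hypothetical helper; `r` is the driver module.
function hasLocationIndexQueries(r) {
  const table = r.db('mydb').table('mytable');
  // The index stores the boolean result of hasFields('location'),
  // so looking up `false` selects the documents without a location.
  const missing = table.getAll(false, { index: 'has_location' });
  return {
    ready: table.indexWait('has_location'), // resolves once the index is built
    ids: missing.pluck('id'),               // ids of documents missing "location"
    count: missing.count(),                 // number of documents missing "location"
  };
}

// Usage (requires an open connection `conn`, so it is left commented out):
// const q = hasLocationIndexQueries(require('rethinkdb'));
// q.ready.run(conn).then(() => q.count.run(conn)).then(console.log);
```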

RethinkDB - Finding documents that are missing a field