Question

We are currently using MongoDB 1.2.2 to create a database and store values. Our data looks like this:

"file" : "1" , "tools": { "foo": { "status": "pending"} }
"file" : "2" , "tools": { "bar": { "status": "pending" } }
"file" : "3" , "tools": { "foo": { "status": "running" } }
"file" : "4" , "tools": { "bar": { "status": "done" } }
"file" : "5" , "tools": { "foo": { "status": "done" } }

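We want to query for every document that has { "status" : "pending" }. We do not want to use {"tools.foo.status" : "pending"}, because there will be many other variations besides foo and bar. To be clearer, we want to do something like {"tools.*.status" : "pending"}.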

Answer 1

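No, you can't do that. I'm afraid you will have to maintain your own index. That is, for every insert/update to the files collection, perform an upsert on a file_status_index collection to record the current status.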

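Querying is also a two-step process: first query the index collection to get the IDs, then issue an $in query against the files collection to get the actual data.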

This may sound scary, but it is the price you have to pay with this schema.
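
As a rough sketch of that bookkeeping, here is the idea in plain JavaScript. In-memory structures stand in for the two collections, and `upsertStatus`, `saveFile`, and `findFilesByStatus` are made-up names for illustration, not MongoDB APIs. A real setup would issue the upsert and the $in query against the server instead.

```javascript
// In-memory stand-ins for the files and file_status_index collections
// (assumption: illustration only, no MongoDB calls are made here).
var files = {};
var fileStatusIndex = [];

// Upsert: update the entry for (file, tool) if it exists, otherwise insert it.
function upsertStatus(index, file, tool, status) {
  for (var i = 0; i < index.length; i++) {
    if (index[i].file === file && index[i].tool === tool) {
      index[i].status = status;
      return;
    }
  }
  index.push({ file: file, tool: tool, status: status });
}

// Called on every insert/update of a file document to keep the index current.
function saveFile(files, index, doc) {
  files[doc.file] = doc;
  Object.keys(doc.tools).forEach(function (tool) {
    upsertStatus(index, doc.file, tool, doc.tools[tool].status);
  });
}

// Two-step query: collect IDs from the index, then fetch those files
// (the second step plays the role of the $in query).
function findFilesByStatus(files, index, status) {
  var ids = [];
  index.forEach(function (entry) {
    if (entry.status === status && ids.indexOf(entry.file) === -1) {
      ids.push(entry.file);
    }
  });
  return ids.map(function (id) { return files[id]; });
}

saveFile(files, fileStatusIndex, { file: "1", tools: { foo: { status: "pending" } } });
saveFile(files, fileStatusIndex, { file: "2", tools: { bar: { status: "pending" } } });
saveFile(files, fileStatusIndex, { file: "3", tools: { foo: { status: "running" } } });

// File 1's tool finishes; saving it again updates the existing index entry.
saveFile(files, fileStatusIndex, { file: "1", tools: { foo: { status: "done" } } });

var pending = findFilesByStatus(files, fileStatusIndex, "pending");
```

Note that this sketch never removes index entries for tools that disappear from a file; a real implementation would have to handle that case as well.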

Answer 2

First, you should upgrade MongoDB. 1.2.2 is a really old version.

Second, you can't query the way you are asking. You can do this using Map/Reduce.
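
A minimal sketch of what such a Map/Reduce pair might look like. The map and reduce functions follow MongoDB's conventions (`this` is the current document, `emit(key, value)` produces output), but `runMapReduce` is only a tiny in-memory harness I made up so the example can run without a server. In the shell you would hand the two functions to the collection's map/reduce facility instead, and the exact invocation depends on your server version.

```javascript
// Map: runs once per document with `this` bound to the document.
// Emits the file id once for every tool whose status is "pending".
function mapPendingTools() {
  for (var toolName in this.tools) {
    if (this.tools[toolName].status === "pending") {
      emit(this.file, 1);
    }
  }
}

// Reduce: sums the emitted counts for one key.
function reducePendingTools(key, values) {
  var total = 0;
  values.forEach(function (v) { total += v; });
  return total;
}

// Hypothetical in-memory harness (not a MongoDB API): groups emitted
// values by key, then reduces each group.
function runMapReduce(docs, map, reduce) {
  var groups = {};
  globalThis.emit = function (key, value) {
    (groups[key] = groups[key] || []).push(value);
  };
  docs.forEach(function (doc) { map.call(doc); });
  delete globalThis.emit;

  var out = {};
  Object.keys(groups).forEach(function (key) {
    out[key] = reduce(key, groups[key]);
  });
  return out;
}

var docs = [
  { file: "1", tools: { foo: { status: "pending" } } },
  { file: "2", tools: { bar: { status: "pending" } } },
  { file: "3", tools: { foo: { status: "running" } } },
  { file: "4", tools: { bar: { status: "done" } } },
  { file: "5", tools: { foo: { status: "done" } } }
];

// Maps each file id to its number of pending tools.
var pendingCounts = runMapReduce(docs, mapPendingTools, reducePendingTools);
```

The keys of `pendingCounts` are the files with at least one pending tool, whatever the tool is called.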

Answer 3

I think it's time to ask why you are storing things the way you are.

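There is no efficient way to search this kind of structure. Since there is no known, fixed key path to the value you want to filter on, every record has to be expanded on every query, which is very expensive, especially once your collection no longer fits in RAM.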
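
To make the cost concrete, here is roughly what any such search has to do, sketched in plain JavaScript over an in-memory array (`scanForStatus` is a made-up name, and nothing here talks to MongoDB). Every document is opened and every tool key is inspected, no matter how few documents match.

```javascript
// Full scan: with no fixed key path there is nothing to index on,
// so each document's tools object must be walked key by key.
function scanForStatus(docs, status) {
  var examined = 0;
  var matches = [];
  docs.forEach(function (doc) {
    examined += 1;
    var tools = doc.tools || {};
    var hit = Object.keys(tools).some(function (name) {
      return tools[name].status === status;
    });
    if (hit) {
      matches.push(doc.file);
    }
  });
  return { examined: examined, matches: matches };
}

var scanResult = scanForStatus([
  { file: "1", tools: { foo: { status: "pending" } } },
  { file: "2", tools: { bar: { status: "pending" } } },
  { file: "3", tools: { foo: { status: "running" } } },
  { file: "4", tools: { bar: { status: "done" } } },
  { file: "5", tools: { foo: { status: "done" } } }
], "pending");
```

Two documents match, but all five had to be examined to find them.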

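IMO, you would be better off using a secondary collection to hold these statuses. Yes, it makes your data storage more relational, but that is because your data is relational.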

file_tools:
  { 'file_id' : 1, 'name' : 'foo', 'status' : 'pending' }
  { 'file_id' : 2, 'name' : 'bar', 'status' : 'pending' }
  { 'file_id' : 3, 'name' : 'foo', 'status' : 'running' }
  { 'file_id' : 4, 'name' : 'bar', 'status' : 'done' }
  { 'file_id' : 5, 'name' : 'foo', 'status' : 'done' }


files:
  { 'id': 1 }
  { 'id': 2 }
  { 'id': 3 }
  { 'id': 4 }
  { 'id': 5 }

> // find out which files have pending tools
> files_with_pending_tools = db.file_tools.find( { 'status' : 'pending' }, { 'file_id' : 1 } )
> //=> [ { 'file_id' : 1 }, { 'file_id' : 2 } ]
> 
> // just get the ids (the callback must return the value it extracts)
> file_ids_with_pending_tools = files_with_pending_tools.map( function( file_tool ){
>    return file_tool['file_id']
> })
> //=> [1,2]
> 
> // query the files
> db.files.find({'id': { $in : file_ids_with_pending_tools }})
> //=> [ { 'id' : 1 }, { 'id' : 2 } ]

MongoDB: querying for a value through multiple relatively unknown keys
