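Extracting values from an array in mongoDB into a data frame using rmongodb

Time: 2012-10-11 12:28:35

Tags: arrays r mongodb rmongodb

I am querying a database that contains entries like the ones shown in the example. Every entry contains the following values: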

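  • _id: unique ID of the overallitem and of each of the placed_items
  • name: the name of the overallitem
  • loc: location of the overallitem and its placed_items
  • time_id: time at which the overallitem was stored
  • placed_items: array containing the placed_items (ranging from zero, placed_items : [], to an unlimited number)
  • category_id: the category of a placed_items element
  • full_id: the full ID of a placed_items element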

I want to extract name, full_id and category_id at the placed_items level, given constraints on time_id and loc.

Example data:

{
 "_id" : "5040",
 "name" : "entry1",
 "loc" : 1,
 "time_id" : 20121001,
 "placed_items" : []
}
{
 "_id" : "5041",
 "name" : "entry2",
 "loc" : 1,
 "time_id" : 20121001,
 "placed_items" : [
  {
   "_id" : "5043",
   "category_id" : 101,
   "full_id" : 901
  },
  {
   "_id" : "5044",
   "category_id" : 102,
   "full_id" : 902
  }
 ]
}
{
 "_id" : "5042",
 "name" : "entry3",
 "loc" : 1,
 "time_id" : 20121001,
 "placed_items" : [
  {
   "_id" : "5045",
   "category_id" : 101,
   "full_id" : 903
  }
 ]
}

The expected result for this example is:

"name"    "full_id" "category_id"
"entry2"    901         101
"entry2"    902         102
"entry3"    903         101

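So, if placed_items is empty, the entry should not be put in the data frame, and if placed_items contains n entries, n rows should be put in the data frame.

I tried to adapt an R-bloggers example to create the desired data frame.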
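The row-per-array-element rule can be sketched in base R without a database connection. This is only a minimal sketch, assuming each document has already been converted to a nested R list (for example with rmongodb's mongo.bson.to.list). The helper name flatten_doc and the docs list are hypothetical and simply mirror the example data.

```r
# Hypothetical helper: turn one document (a nested R list) into
# one data-frame row per element of its placed_items array.
flatten_doc <- function(doc) {
  items <- doc$placed_items
  if (length(items) == 0) {
    # Empty array: this document contributes zero rows.
    return(data.frame(name = character(0), full_id = numeric(0),
                      category_id = numeric(0), stringsAsFactors = FALSE))
  }
  data.frame(
    name        = rep(doc$name, length(items)),
    full_id     = vapply(items, function(it) it$full_id, numeric(1)),
    category_id = vapply(items, function(it) it$category_id, numeric(1)),
    stringsAsFactors = FALSE
  )
}

# In-memory copies of the example documents (assumed shape after conversion).
docs <- list(
  list(name = "entry1", placed_items = list()),
  list(name = "entry2", placed_items = list(
    list(category_id = 101, full_id = 901),
    list(category_id = 102, full_id = 902))),
  list(name = "entry3", placed_items = list(
    list(category_id = 101, full_id = 903)))
)

# Flatten every document and stack the pieces into one data frame.
results <- do.call(rbind, lapply(docs, flatten_doc))
print(results)
```

Building one small data frame per document keeps the "zero items means zero rows" case explicit, at the cost of some speed on very large result sets.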

    library(rmongodb)

    #Set up database
    mongo <- mongo.create()

    #Set up condition
    buf <- mongo.bson.buffer.create()
    mongo.bson.buffer.append(buf, "loc", 1)
    mongo.bson.buffer.start.object(buf, "time_id")
    mongo.bson.buffer.append(buf, "$gte", 20120930)
    mongo.bson.buffer.append(buf, "$lte", 20121002)
    mongo.bson.buffer.finish.object(buf)
    query <- mongo.bson.from.buffer(buf)

    #Count
    count <- mongo.count(mongo, "items_test.overallitem", query)

    #Note that this count doesn't work, since the count should be based on
    #the number of placed_items in the array, and not the number of entries.

    #Set up cursor
    cursor <- mongo.find(mongo, "items_test.overallitem", query)
    #Create vectors, which will be filled by the while loop
    name <- vector("character", count)
    full_id <- vector("character", count)
    category_id <- vector("character", count)

    i <- 1
    #Fill vectors
    while (mongo.cursor.next(cursor)) {
        b <- mongo.cursor.value(cursor)
        name[i] <- mongo.bson.value(b, "name")
        #These dotted lookups are the part that does not work
        full_id[i] <- mongo.bson.value(b, "placed_items.full_id")
        category_id[i] <- mongo.bson.value(b, "placed_items.category_id")
        i <- i + 1
    }
    #Convert to data frame
    results <- as.data.frame(list(name=name, full_id=full_id, category_id=category_id))

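The conditions work and the code works if I want to extract values at the overallitem level (i.e. _id or name), but it fails to gather the information at the placed_items level. In addition, the dotted call used to extract placed_items.full_id does not seem to work. Can anyone help?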
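One possible way around both problems (a count that reflects documents rather than array elements, and the dotted field lookup) is to stop preallocating vectors and instead append one row per array element while walking the cursor. The sketch below assumes each cursor value can be converted to a nested R list with mongo.bson.to.list and that the placed_items array arrives as a list of lists. That conversion has not been verified against this database. To keep the example runnable without a server, the cursor is replaced by a hypothetical stand-in, make_fake_cursor, and the comments mark where the rmongodb calls would go.

```r
# Hypothetical stand-in for a MongoDB cursor so the sketch runs without a server.
make_fake_cursor <- function(docs) {
  pos <- 0
  list(
    # Plays the role of mongo.cursor.next(cursor)
    has_next = function() { pos <<- pos + 1; pos <= length(docs) },
    # Plays the role of mongo.bson.to.list(mongo.cursor.value(cursor))
    value = function() docs[[pos]]
  )
}

cursor <- make_fake_cursor(list(
  list(name = "entry1", placed_items = list()),
  list(name = "entry2", placed_items = list(
    list(category_id = 101, full_id = 901),
    list(category_id = 102, full_id = 902))),
  list(name = "entry3", placed_items = list(
    list(category_id = 101, full_id = 903)))
))

# Grows by one element per placed item, so no count is needed up front.
rows <- list()
while (cursor$has_next()) {
  doc <- cursor$value()
  # An empty placed_items list means the inner loop body never runs.
  for (item in doc$placed_items) {
    rows[[length(rows) + 1]] <- data.frame(
      name = doc$name,
      full_id = item$full_id,
      category_id = item$category_id,
      stringsAsFactors = FALSE
    )
  }
}

# Note: do.call(rbind, list()) returns NULL when no rows were collected.
results <- do.call(rbind, rows)
print(results)
```

Reading the nested values from the converted list avoids relying on dotted names such as "placed_items.full_id", which address a single field and so cannot return one value per array element.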

0 answers:

No answers yet.