The documents in my current collection, which we'll call original:
{ "city" : [ "Delhi" ], "location" : [ "Dwarka" ] , "tags" : [ "Estate Agents For Residential", "Estate Agents", "Agents For Residence" ] }
{ "city" : [ "Delhi" ], "location" : [ "Dwarka" ], "tags" : [ "Estate Agents For Residential", "Estate Agents", "Commercial Rental" ]}
{ "city" : [ "Delhi" ], "location" : [ "Dwarka" ], "tags" : [ "Estate Agents For Residential", "Estate Agents" ] }
{ "city" : [ "Delhi" ], "location" : [ "South Extension" ], "tags" : [ "Estate Agents For Residence" ] }
{ "city" : [ "Delhi" ], "location" : [ "Greater Kailash II" ], "tags" : [ "Estate Agents For Residence" ] }
{ "city" : [ "Delhi" ], "location" : [ "Greater Kailash II" ], "tags" : [ "Estate Agents For Rental" ] }
The first collection I want to generate from the original collection, which we'll call city-locations:
{ "city" : [ "Delhi" ], "locations" : [ "Dwarka", "South Extension", "Greater Kailash II" ] }
The second collection I want to generate from the original collection, which we'll call city-location-tags:
{ "city" : [ "Delhi" ], "location" : [ "Dwarka" ], "tags" : [ "Estate Agents For Residential", "Estate Agents", "Agents For Residence", "Commercial Rental" ] }
{ "city" : [ "Delhi" ], "location" : [ "South Extension" ], "tags" : [ "Estate Agents For Residence" ]}
{ "city" : [ "Delhi" ], "location" : [ "Greater Kailash II" ], "tags" : [ "Estate Agents For Residence", "Estate Agents For Rental" ] }
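My challenge: my original collection has more than a million documents, and extracting all locations for a selected city, and all tags from the selected city, takes a long time when populating the dependent dropdowns. By creating smaller collections I'm trying to get faster response times. In my project, when the user picks a city from a dropdown, I have to show all available locations for that city in the next dropdown, and once a location is picked I have to show all available tags for that location in the following dropdown. This has to happen quickly.
Thanks for your help.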
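Answer 0 (score: 1)
Your problem can be solved with the aggregation framework. The full reference can be found at http://docs.mongodb.org/manual/aggregation/
Your first result set can be created with the following: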
db.original.aggregate([
{$group : {_id: {city:"$city", location:"$location"} } },
{$project: {_id:0, city: "$_id.city", location: "$_id.location"} },
{$unwind: "$location"},
{$group : {_id: "$city", locations: { $addToSet: "$location"} } },
{$project: {_id:0, city: "$_id", locations: "$locations"} }
])
Your second result set can be created with the following:
db.original.aggregate([
{$unwind: "$tags"},
{$group: { _id: { city:"$city", location:"$location"}, tags: { $addToSet: "$tags" } } },
{$project: { _id:0, city:"$_id.city", location:"$_id.location", tags:"$tags" } }
])
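To make the effect of $unwind followed by $group with $addToSet more concrete, here is a plain JavaScript emulation run against the sample documents from the question. It is only an illustration of the grouping logic, not how MongoDB implements it; the function name groupTags is made up for this sketch, and no database connection is involved.

```javascript
// Plain-JavaScript emulation of: $unwind "$tags" -> $group by {city, location} with $addToSet.
// Sample documents are copied from the question; groupTags is a hypothetical helper name.
const docs = [
  { city: ["Delhi"], location: ["Dwarka"], tags: ["Estate Agents For Residential", "Estate Agents", "Agents For Residence"] },
  { city: ["Delhi"], location: ["Dwarka"], tags: ["Estate Agents For Residential", "Estate Agents", "Commercial Rental"] },
  { city: ["Delhi"], location: ["Dwarka"], tags: ["Estate Agents For Residential", "Estate Agents"] },
  { city: ["Delhi"], location: ["South Extension"], tags: ["Estate Agents For Residence"] },
  { city: ["Delhi"], location: ["Greater Kailash II"], tags: ["Estate Agents For Residence"] },
  { city: ["Delhi"], location: ["Greater Kailash II"], tags: ["Estate Agents For Rental"] }
];

function groupTags(documents) {
  const groups = new Map();
  for (const doc of documents) {
    // The key mirrors _id: {city, location}; JSON.stringify makes the arrays comparable.
    const key = JSON.stringify({ city: doc.city, location: doc.location });
    if (!groups.has(key)) {
      groups.set(key, { city: doc.city, location: doc.location, tags: new Set() });
    }
    // Looping over tags plays the role of $unwind; Set.add plays the role of $addToSet.
    for (const tag of doc.tags) {
      groups.get(key).tags.add(tag);
    }
  }
  // Convert each Set back to an array, like the final $project stage's output shape.
  return Array.from(groups.values(), g => ({ city: g.city, location: g.location, tags: Array.from(g.tags) }));
}

const cityLocationTags = groupTags(docs);
```

The emulation yields one entry per distinct city and location pair, each with its de-duplicated tags. Note that MongoDB does not guarantee the order of elements produced by $addToSet, so the tag order here should not be relied on.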
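However, I'm skeptical about whether you need to create separate collections, since they would have to be dropped and recreated every time the original collection is updated. It would make more sense to cache the results for each city (especially in the second case) and invalidate the key whenever you have an update.
Also, why do you have single-element lists (city, location) in your results?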
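The caching idea mentioned above could be sketched roughly as follows. This is a minimal in-memory version in plain JavaScript; the class name CityCache and the loadLocations callback are assumptions made for illustration, and the loader here is a stand-in rather than a real database call. A real driver call would typically be asynchronous, which this sketch ignores.

```javascript
// Minimal in-memory cache keyed by city. `loadLocations` is a hypothetical
// function that would run the aggregation (or a query) for a single city.
class CityCache {
  constructor(loadLocations) {
    this.loadLocations = loadLocations;
    this.entries = new Map();
  }

  // Return the cached value, loading it on the first request for that city.
  get(city) {
    if (!this.entries.has(city)) {
      this.entries.set(city, this.loadLocations(city));
    }
    return this.entries.get(city);
  }

  // Drop the cached value; call this after any write that touches the city.
  invalidate(city) {
    this.entries.delete(city);
  }
}

// Usage with a stand-in loader that counts how often it is called.
let loads = 0;
const cache = new CityCache(city => {
  loads += 1;
  return ["Dwarka", "South Extension", "Greater Kailash II"];
});

cache.get("Delhi");        // miss: the loader runs
cache.get("Delhi");        // hit: served from the Map
cache.invalidate("Delhi"); // an update happened for Delhi
cache.get("Delhi");        // miss again: the loader runs a second time
```

Keying by city keeps invalidation cheap: an update to one city's documents only discards that city's entry.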
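Answer 1 (score: 0)
If you change city to a JSON value instead of a JSON array, you can do this with a simple command, because the _id of the aggregated collection can't be an array. Do you really need city to be an array in the original documents?
You can do it like this: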
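// Note: in MongoDB 2.6+ shells aggregate() returns a cursor, so use
// cityLocations.toArray() in place of cityLocations.result below.
var cityLocations = db.original.aggregate({$group : {_id:"$city", locations: {$addToSet: "$location"}}});
db.createCollection("cityLocations");
db.cityLocations.insert(cityLocations.result);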
You can group it into city-location-tags in a similar way. Take a look at the Aggregation commands.
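Following the same pattern, the city-location-tags grouping might look like the sketch below. It assumes that city and location are stored as plain strings as suggested above, and that the server supports the $out stage (MongoDB 2.6 or later), which writes the results straight into a collection. The target collection name cityLocationTags is made up for this example.

```javascript
// Sketch of a pipeline for city-location-tags, assuming scalar city/location fields.
// "cityLocationTags" is a hypothetical target collection name.
const cityLocationTagsPipeline = [
  // One document per tag, so tags from different documents can be merged.
  { $unwind: "$tags" },
  // Collect the distinct tags for every city/location pair.
  { $group: { _id: { city: "$city", location: "$location" }, tags: { $addToSet: "$tags" } } },
  // Write the grouped documents to the target collection (requires MongoDB 2.6+).
  { $out: "cityLocationTags" }
];

// `db` only exists inside the mongo shell; the guard lets this file load elsewhere too.
if (typeof db !== "undefined") {
  db.original.aggregate(cityLocationTagsPipeline);
}
```

With $out there is no separate insert step, and the target collection is replaced each time the pipeline runs, so rerunning it after updates refreshes the derived data.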
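Answer 2 (score: 0)
I can offer a correction to @Jinxcat's answer. Running the aggregation as originally posted leaves the city field empty, because the result of the $group/$addToSet stage has no city field. The city data is in the _id field. So in the final $project stage, the reference to _id.city is wrong and should refer to _id only.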
db.india.aggregate([
{$group: {_id:{city:"$city", location:"$location"}}},
{$project: {_id:0, city: "$_id.city", location:"$_id.location"}},
{$group : {_id:"$city", locations: {$addToSet: "$location"}} },
{$project: {_id:0, city: "$_id", locations: "$locations"} }
])
The stage that changed is: {$project: {_id:0, city: "$_id", locations: "$locations"} }
Answer 3 (score: 0)
Hi @Kumar Deepam, this should produce the second set with the keys in the right order: city, location, and tags.
db.india.aggregate([
{$unwind : "$city"}, {$unwind : "$location"}, {$unwind : "$tags"},
{$group: {_id:{city : "$city", location :"$location"}, tags : {$addToSet:"$tags"}}},
{$group: {_id:{city : "$_id.city", location : "$_id.location", tags : "$tags"}}},
{$project: {_id:0, city : "$_id.city", location : "$_id.location", tags : "$_id.tags"}}
])
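It's longer than @Jinxcat's answer, but if having the keys in the right order matters, this should do the job.
BTW, I wonder whether this aggregation can be optimized. Does anyone have any ideas? Thanks, everyone.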