Find common string, aggregation, MongoDB

Time: 2017-07-25 18:20:52

Tags: mongodb aggregation-framework

Doc 1:

{
  typeId: 'A1',
  name: 'EAGLE-25'
}

Doc 2:

{
  typeId: 'A1',
  name: 'EAGLE-32'
}

Doc 3:

{
  typeId: 'B1',
  name: 'FOX5'
}

Doc 4:

{
  typeId: 'B1',
  name: 'FOX15'
}

Wanted result after the aggregation query:

[
  {
     typeId: 'A1',
     commonName: 'EAGLE',
     names: ['EAGLE-25', 'EAGLE-32']
  },
  {
     typeId: 'B1',
     commonName: 'FOX',
     names: ['FOX5', 'FOX15']
  }
]

Is this possible with the aggregation framework?

1 answer:

Answer 0 (score: 1)

Here you go:

db.getCollection('test').aggregate
([ // pass the stages as a single array
  {
    $group:
    {
      _id:
      {
        "typeId": "$typeId",
        // group by the part of "name" before the first "-";
        // this only works as intended when every name contains a "-"
        "commonName": { "$substrCP": [ "$name", 0, { "$indexOfCP": [ "$name", "-" ] } ] }
      },
      "names": { $push: "$name" } // create the "names" array per group
    }
  },
  {
    $project:
    {
      "_id": 0, // get rid of _id field
      "typeId": "$_id.typeId", // flatten "_id.typeId" into "typeId"
      "commonName": "$_id.commonName", // flatten "_id.commonName" into "commonName"
      "names": "$names" // include "names" array the way it is
    }
  }
])

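As always with MongoDB aggregations, you can see what is going on by removing the stages one at a time, starting from the end of the pipeline, and looking at the intermediate output.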
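To make that concrete, here is a rough plain-JavaScript emulation of what the `$group` stage alone would emit, before `$project` flattens `_id`. It is only a sketch that runs in Node.js without a database: `groupStage`, `docs` and `afterGroup` are names made up for this illustration, and it assumes ASCII names that all contain a hyphen.

```javascript
// Sample documents with hyphenated names, as in the question.
const docs = [
  { typeId: 'A1', name: 'EAGLE-25' },
  { typeId: 'A1', name: 'EAGLE-32' },
];

// Hypothetical helper: emulates the $group stage in plain JavaScript.
function groupStage(input) {
  const groups = new Map();
  for (const { typeId, name } of input) {
    const dash = name.indexOf('-');
    // Assumption of this sketch: every name contains a hyphen.
    if (dash < 0) throw new Error(`no "-" in ${name}`);
    const _id = { typeId, commonName: name.slice(0, dash) };
    const key = JSON.stringify(_id);
    if (!groups.has(key)) groups.set(key, { _id, names: [] });
    groups.get(key).names.push(name); // plays the role of $push
  }
  return [...groups.values()];
}

const afterGroup = groupStage(docs);
console.log(JSON.stringify(afterGroup, null, 2));
```

The grouping key is still nested under `_id` at this point, which is why the final `$project` stage is there to lift `typeId` and `commonName` to the top level.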

Edit:

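After your recent change to the question, my answer above no longer makes much sense, and I cannot think of a way to make your generic "least common denominator" query work.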

More importantly, though, I think something is missing from your specification. Imagine your database contains the following elements:

{
  typeId: 'A1',
  name: 'EAGLE-25'
}

{
  typeId: 'A1',
  name: 'EATS-26'
}

{
  typeId: 'A1',
  name: 'EVERYTHING-27'
}

With your "least common denominator" concept, the result would be:

[
  {
     typeId: 'A1',
     commonName: 'E',
     names: ['EAGLE-25', 'EATS-26', 'EVERYTHING-27']
  }
]

That result does not seem to make much sense any more...?!
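The effect is easy to reproduce outside the database. The following is a minimal sketch in plain JavaScript (Node.js), not an aggregation pipeline; `longestCommonPrefix` and `threeNames` are hypothetical names introduced only for this illustration.

```javascript
// Hypothetical helper: longest common prefix of a list of strings.
function longestCommonPrefix(names) {
  if (names.length === 0) return '';
  let prefix = names[0];
  for (const name of names.slice(1)) {
    let i = 0;
    // Advance while both strings agree character by character.
    while (i < prefix.length && i < name.length && prefix[i] === name[i]) i++;
    prefix = prefix.slice(0, i);
  }
  return prefix;
}

const threeNames = ['EAGLE-25', 'EATS-26', 'EVERYTHING-27'];
console.log(longestCommonPrefix(threeNames));               // E
console.log(longestCommonPrefix(['FOX5', 'FOX15']));         // FOX
console.log(longestCommonPrefix(['EAGLE-25', 'EAGLE-32']));  // EAGLE-
```

Note that a literal common prefix of the two EAGLE names keeps the trailing hyphen, so getting `EAGLE` as in the question would need an extra rule for stripping separators.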

Edit 2:

I have an idea, assuming you can define a maximum length for the "common prefix". I think this gets us pretty close to what you want:

db.getCollection('eagle').aggregate
([ // pass the stages as a single array
  {
    $project:
    {
      // yields 1..9: the end value (10) is exclusive, so set it to the maximum length of your common prefix plus one
      "range": { $range: [ 1, 10, 1 ] },
      "typeId": "$typeId",
      "name": "$name"
    }
  },
  { $unwind: "$range" }, // one document per (name, prefix length) pair
  {
    $project:
    {
      "typeId": "$typeId",
      "name": "$name",
      "commonName": { $substrCP: [ "$name", 0, "$range" ] } // extract the first "range" characters of the name
    }
  },
  {
    // $addToSet avoids duplicates when a name is shorter than the prefix length
    $group: { _id: { "typeId": "$typeId", "commonName": "$commonName" }, "names": { $addToSet: "$name" } }
  },
  {
    $project:
    {
      "_id": 0, // get rid of _id field
      "typeId": "$_id.typeId", // flatten "_id.typeId" into "typeId"
      "commonName": "$_id.commonName", // flatten "_id.commonName" into "commonName"
      "names": "$names" // include "names" array the way it is
    }
  }
])
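The pipeline above emits one group per prefix of every length, including groups that contain a single name, so some post-processing is still needed. As a rough sketch of one way to finish the job in application code (an assumption on my part, not something the question specifies), the plain-JavaScript emulation below rebuilds those groups and then keeps, per `typeId`, the longest prefix shared by at least two names. `prefixGroups`, `longestSharedPrefixPerType`, `items` and `result` are hypothetical names, and ASCII names are assumed.

```javascript
const items = [
  { typeId: 'A1', name: 'EAGLE-25' },
  { typeId: 'A1', name: 'EAGLE-32' },
  { typeId: 'B1', name: 'FOX5' },
  { typeId: 'B1', name: 'FOX15' },
];

// Hypothetical helper mirroring the pipeline: one group per
// (typeId, prefix of length 1 .. maxLen - 1).
function prefixGroups(input, maxLen = 10) {
  const groups = new Map();
  for (const { typeId, name } of input) {
    const chars = Array.from(name); // split by code point
    for (let len = 1; len < maxLen; len++) { // end is exclusive, like $range
      const commonName = chars.slice(0, len).join('');
      const key = JSON.stringify([typeId, commonName]);
      if (!groups.has(key)) groups.set(key, { typeId, commonName, names: new Set() });
      groups.get(key).names.add(name); // plays the role of $addToSet
    }
  }
  return [...groups.values()].map(g => ({ ...g, names: [...g.names] }));
}

// Hypothetical post-filter: per typeId, keep the longest prefix
// that is shared by at least two names.
function longestSharedPrefixPerType(groups) {
  const best = new Map();
  for (const g of groups) {
    if (g.names.length < 2) continue; // drop prefixes matching a single name
    const current = best.get(g.typeId);
    if (!current || g.commonName.length > current.commonName.length) {
      best.set(g.typeId, g);
    }
  }
  return [...best.values()];
}

const result = longestSharedPrefixPerType(prefixGroups(items));
console.log(JSON.stringify(result, null, 2));
```

For the four sample documents this keeps `EAGLE-` (with the hyphen) for `A1` and `FOX` for `B1`. It is only a heuristic: if one `typeId` contains several unrelated clusters of names, this filter keeps just the one with the longest shared prefix.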