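I have the following JSON document, and I need to aggregate the distinct keywords under the conversion_token arrays together with the number of times each one occurs. For example, given:
"conversion_token": [
  {
    "keyword": "DBMS",
    "count": 4,
    "classify": 2
  }
]
The keyword DBMS is used several times under different structures in the JSON provided, so the aggregation should show:
"conversion_token": [
  {
    "keyword": "DBMS",
    "count": 6,
    "classify": 2
  },
  {
    "keyword": "NVL",
    "count": 2,
    "classify": 2
  }
]
and so on.
How can I do this?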
{
  "select_emp": {
    "specification": {
      "input": [
        "p_empno"
      ],
      "declare_stmt": {
        "anchorvariable": [
          "V_ENAME",
          "V_HIREDATE",
          "V_TITLE",
          "V_REPORTSTO",
          "V_DISP_DATE",
          "V_INS_COUNT",
          "CITY_FROM"
        ],
        "tablename_variable": [
          "EMPLOYEE.V_ENAME",
          "EMPLOYEE.V_HIREDATE",
          "EMPLOYEE.V_TITLE",
          "EMPLOYEE.V_REPORTSTO",
          "EMPLOYEE.V_DISP_DATE",
          "EMPLOYEE.V_INS_COUNT",
          "EMPLOYEE.CITY_FROM"
        ]
      }
    },
    "body": {
      "select_stmt1": {
        "columns": [
          "FIRSNAME",
          "HIREDATE",
          "TITLE",
          "REPORTSTO"
        ],
        "tablename": [
          "EMPLOYEE"
        ],
        "conversion_token": [
          {
            "keyword": "NVL",
            "count": 1,
            "classify": 2
          }
        ]
      },
      "select_stmt2": {
        "columns": [
          "CITY"
        ],
        "tablename": [
          "EMPLOYEE"
        ],
        "conversion_token": [
          {
            "keyword": "DECODE",
            "count": 1,
            "classify": 3
          }
        ]
      },
      "dbms_stmt1": {
        "dbms_putline": [
          "P_EMPNO",
          "V_ENAME",
          "V_DISP_DATE",
          "V_REPORTSTO"
        ],
        "conversion_token": [
          {
            "keyword": "DBMS",
            "count": 1,
            "classify": 2
          }
        ]
      },
      "forloop1": {
        "select_stmt": {
          "columns": [
            "EMPLOYEEID",
            "ROWID"
          ],
          "tablename": [
            "EMPLOYEE"
          ],
          "conversion_token": [
            {
              "keyword": "DBMS",
              "count": 1,
              "classify": 2
            }
          ]
        }
      },
      "merge_stmt1": {
        "merge_into": "EMPLOYEE",
        "merge_using": {
          "columns": [
            "EMPLOYEEID",
            "LASTNAME",
            "TITLE",
            "BIRTHDATE",
            "HIREDATE",
            "ADDRESS",
            "CITY",
            "STATE",
            "COUNTRY",
            "POSTALCODE",
            "PHONE",
            "FAX",
            "EMAIL",
            "BONUS"
          ],
          "tablename": [
            "EMPLOYEE"
          ]
        },
        "merge_update": {
          "columns": [
            "BONUS"
          ],
          "tablename": [
            "EMPLOYEE"
          ]
        },
        "merge_delete": {
          "columns": [
            "BONUS"
          ],
          "tablename": [
            "EMPLOYEE"
          ]
        },
        "merge_insert": {
          "columns": [
            "EMPLOYEEID",
            "LASTNAME",
            "FIRSTNAME",
            "TITLE",
            "BIRTHDATE",
            "HIREDATE",
            "ADDRESS",
            "CITY",
            "STATE",
            "COUNTRY",
            "POSTALCODE",
            "PHONE",
            "FAX",
            "EMAIL",
            "BONUS"
          ],
          "tablename": [
            "EMPLOYEE"
          ]
        },
        "conversion_token": [
          {
            "keyword": "Merge",
            "count": 1,
            "classify": 4
          }
        ]
      },
      "exception_handling1": {
        "dbms_putline": [
          "P_EMPNO"
        ],
        "conversion_token": [
          {
            "keyword": "DBMS",
            "count": 1,
            "classify": 2
          }
        ]
      }
    }
  }
}
Answer 0 (score: 0)
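First, use $project with $concatArrays to merge the conversion_token arrays into a single array. Then $unwind splits that array so that each conversion_token object becomes its own document, ready to aggregate. After the $unwind, group the documents by keyword with $group and sum their count values.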
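db.collection.aggregate([
  {
    // Merge every conversion_token array into one array.
    // $concatArrays returns null if any path is missing,
    // so each path must match the document exactly.
    $project: {
      conversion_tokens: {
        $concatArrays: [
          "$select_emp.body.select_stmt1.conversion_token",
          "$select_emp.body.select_stmt2.conversion_token",
          "$select_emp.body.dbms_stmt1.conversion_token",
          // In the sample document this array sits under forloop1.select_stmt.
          "$select_emp.body.forloop1.select_stmt.conversion_token",
          "$select_emp.body.merge_stmt1.conversion_token",
          "$select_emp.body.exception_handling1.conversion_token"
        ]
      }
    }
  },
  {
    // Emit one document per conversion_token entry.
    $unwind: "$conversion_tokens"
  },
  {
    // Sum the counts for each distinct keyword.
    $group: {
      _id: "$conversion_tokens.keyword",
      count: {
        $sum: "$conversion_tokens.count"
      }
    }
  }
])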
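For the sample document this produces results such as the following (the order of the groups is not guaranteed):
{
  "_id": "DBMS",
  "count": 3
}, {
  "_id": "NVL",
  "count": 1
}, {
  "_id": "DECODE",
  "count": 1
}, {
  "_id": "Merge",
  "count": 1
}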
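As a cross-check on those totals, and for documents whose nesting is not known in advance, the same merge-then-group logic can be sketched in plain JavaScript outside MongoDB. This is a client-side alternative, not part of the aggregation pipeline. The function names collectTokens and sumByKeyword and the trimmed doc object are hypothetical, made up for illustration.

```javascript
// Recursively gather every entry of every "conversion_token" array in a value.
function collectTokens(node, found = []) {
  if (Array.isArray(node)) {
    node.forEach((item) => collectTokens(item, found));
  } else if (node !== null && typeof node === "object") {
    for (const [key, value] of Object.entries(node)) {
      if (key === "conversion_token" && Array.isArray(value)) {
        found.push(...value);
      } else {
        collectTokens(value, found);
      }
    }
  }
  return found;
}

// Sum count per keyword: the client-side counterpart of the $group stage.
function sumByKeyword(tokens) {
  const totals = {};
  for (const { keyword, count } of tokens) {
    totals[keyword] = (totals[keyword] || 0) + count;
  }
  return totals;
}

// Trimmed copy of the sample document: only the conversion_token arrays are kept.
const doc = {
  select_emp: {
    body: {
      select_stmt1: { conversion_token: [{ keyword: "NVL", count: 1, classify: 2 }] },
      select_stmt2: { conversion_token: [{ keyword: "DECODE", count: 1, classify: 3 }] },
      dbms_stmt1: { conversion_token: [{ keyword: "DBMS", count: 1, classify: 2 }] },
      forloop1: {
        select_stmt: { conversion_token: [{ keyword: "DBMS", count: 1, classify: 2 }] }
      },
      merge_stmt1: { conversion_token: [{ keyword: "Merge", count: 1, classify: 4 }] },
      exception_handling1: { conversion_token: [{ keyword: "DBMS", count: 1, classify: 2 }] }
    }
  }
};

const totals = sumByKeyword(collectTokens(doc));
console.log(totals); // { NVL: 1, DECODE: 1, DBMS: 3, Merge: 1 }
```

Because the walk matches on the key name rather than on a fixed path, it picks up the array nested under forloop1.select_stmt without listing it explicitly.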
If you want to rename the _id key in the pipeline's output to keyword, you can add another $project stage.
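A minimal sketch of that final stage, written as two stages that would take the place of the $group stage in the pipeline above. It also carries classify through $group with $first, which assumes that a given keyword always has the same classify value. That holds for the sample document but is an assumption in general. The variable name stages is hypothetical.

```javascript
// Stages intended to replace the $group stage and add a final $project.
const stages = [
  {
    $group: {
      _id: "$conversion_tokens.keyword",
      count: { $sum: "$conversion_tokens.count" },
      // Assumption: classify is identical for every occurrence of a keyword.
      classify: { $first: "$conversion_tokens.classify" }
    }
  },
  {
    // Expose the group key as "keyword" and drop the original _id field.
    $project: { _id: 0, keyword: "$_id", count: 1, classify: 1 }
  }
];
```

With these stages each result document has the keyword, count, and classify fields that the question asks for.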