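jq: Grouping by and flattening a nested structure in JSON

Time: 2018-12-05 10:42:44

Tags: json jq

I am new to jq and to command-line tools in general, but I need to group and flatten a nested structure in a JSON file, and after several days I still have not found a workable solution. Here is a sample of my JSON: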

[
  {
    "Value1": "0",
    "Conversions": "0",
    "Revenue": "0.00",
    "serverTimestamp": 84615198,
    "pluginsIcons": [
      {
        "pluginName": "pdf",
        "pluginIcon": "pdf1"
      },
      {
        "pluginName": "java",
        "pluginIcon": "java1"
      }
    ],
    "plugins": "pdf, java",
    "customVariables": {
      "3": {
        "customVariableValue3": "F",
        "customVariableName3": "Gender"
      },
      "2": {
        "customVariableValue2": "Person",
        "customVariableName2": "Role"
      },
      "1": {
        "customVariableValue1": "Partner1",
        "customVariableName1": "Partner"
      }
    },
    "interactions": "7",
    "actions": "3",
    "actionDetails": [
      {
        "timestamp": 84615195,
        "interactionPosition": "1",
        "type": "action"
      },
      {
        "timestamp": 84615145,
        "interactionPosition": "2",
        "type": "action"
      },
      {
        "timestamp": 84615693,
        "interactionPosition": "3",
        "type": "action",
        "customVariables": {
          "2": {
            "customVariablePageValue2": "value2",
            "customVariablePageName2": "name2"
          },
          "1": {
            "customVariablePageValue1": "value1",
            "customVariablePageName1": "name1"
          }
        }
      }
    ],
    "operatingSystem": "Windows 10"
  },
  {
    "Value1": "18",
    "Conversions": "1",
    "Revenue": "0.00",
    "serverTimestamp": 84615189,
    "pluginsIcons": [
      {
        "pluginName": "pdf",
        "pluginIcon": "pdf1"
      }
    ],
    "plugins": "pdf",
    "customVariables": {
      "3": {
        "customVariableValue3": "M",
        "customVariableName3": "Gender"
      },
      "2": {
        "customVariableValue2": "Admin",
        "customVariableName2": "Role"
      },
      "1": {
        "customVariableValue1": "Place",
        "customVariableName1": "Subdomain"
      }
    },
    "interactions": "6",
    "actions": "3",
    "actionDetails": [
      {
        "timestamp": 84635189,
        "timeSpent": "11",
        "interactionPosition": "1",
        "type": "action"
      },
      {
        "timestamp": 846351834,
        "timeSpent": "11",
        "interactionPosition": "2",
        "type": "search"
      },
      {
        "timestamp": 846351832,
        "timeSpent": "1",
        "interactionPosition": "3",
        "type": "action",
        "customVariables": {
          "2": {
            "customVariablePageValue2": "value2",
            "customVariablePageName2": "name2"
          },
          "1": {
            "customVariablePageValue3": "value3",
            "customVariablePageName3": "name3"
          }
        },
        "generationTime": "890"
      }
    ],
    "operatingSystem": "Windows 10"
  }
]

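The end result I would like is one flattened entry for every "action" in the nested array under "actionDetails".

I have been able to flatten the structure, but the grouping step (duplicating the other columns for each action) has me puzzled. Grouping by "action" before flattening does not work for me, because the actions are nested.

Here is an example of how the first entry of the original JSON should look afterwards: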

[
  {
    "timestamp": 84615195,
    "interactionPosition": "1",
    "type": "action",
    "Value1": "0",
    "Conversions": "0",
    "Revenue": "0.00",
    "pluginName1": "pdf",
    "pluginIcon1": "pdf",
    "pluginName2": "java",
    "pluginIcon2": "java",
    "plugins": "pdf, java",
    "Gender": "F",
    "Role": "Person",
    "Partner": "Partner1",
    "interactions": "7",
    "actions": "3",
    "operatingSystem": "Windows 10"
  },
  {
    "timestamp": 84615145,
    "interactionPosition": "2",
    "type": "action",
    "Value1": "0",
    "Conversions": "0",
    "Revenue": "0.00",
    "pluginName1": "pdf",
    "pluginIcon1": "pdf",
    "pluginName2": "java",
    "pluginIcon2": "java",
    "plugins": "pdf, java",
    "Gender": "F",
    "Role": "Person",
    "Partner": "Partner1",
    "interactions": "7",
    "actions": "3",
    "operatingSystem": "Windows 10"
  },
  {
    "timestamp": 84615693,
    "interactionPosition": "3",
    "type": "action",
    "Value1": "0",
    "Conversions": "0",
    "Revenue": "0.00",
    "pluginName1": "pdf",
    "pluginIcon1": "pdf",
    "pluginName2": "java",
    "pluginIcon2": "java",
    "plugins": "pdf, java",
    "Gender": "F",
    "Role": "Person",
    "Partner": "Partner1",
    "interactions": "7",
    "actions": "3",
    "operatingSystem": "Windows 10",
    "name1": "value1",
    "name2": "value2"
  }
]

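As you may notice above, some of the flattened key names have been replaced by the associated value (from within the same nested structure). This is not strictly necessary, but it would be a nice bonus. Also worth noting: my JSON is very large (800MB) and I would like this to work at that size, though that is probably best raised in a separate question.

Thanks in advance for any help or suggestions!

1 Answer:

Answer 0: (score: 0)

The following answer does not meet all the requirements you mention, but it should get you past the main hurdle you have evidently been facing.

Since your requirements regarding "customVariables" are unclear to me, I will ignore .customVariables altogether, in the hope that you will also be able to handle .pluginsIcons once you are past the main hurdle. So, for clarity, I will simply delete these keys.

As I understand it, you want to do some grouping along with flattening based on .actionDetails. Those requirements are also unclear to me, so let's focus on the flattening: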

.[]
| .actionDetails[] + (del(.actionDetails) | del(.customVariables) | del(.pluginsIcons))

This produces a stream of JSON objects, the first two of which are:

{
  "timestamp": 84615195,
  "interactionPosition": "1",
  "type": "action",
  "Value1": "0",
  "Conversions": "0",
  "Revenue": "0.00",
  "serverTimestamp": 84615198,
  "plugins": "pdf, java",
  "interactions": "7",
  "actions": "3",
  "operatingSystem": "Windows 10"
}
{
  "timestamp": 84615145,
  "interactionPosition": "2",
  "type": "action",
  "Value1": "0",
  "Conversions": "0",
  "Revenue": "0.00",
  "serverTimestamp": 84615198,
  "plugins": "pdf, java",
  "interactions": "7",
  "actions": "3",
  "operatingSystem": "Windows 10"
}
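
If you need a single JSON array rather than a stream, as in your expected output, one option is to wrap the whole filter in [ ... ]. The following is only a minimal sketch, assuming jq 1.5 or later is installed; sample.json is a hypothetical file name holding a trimmed-down copy of your data:

```shell
# sample.json is a hypothetical file name; its content is a cut-down
# version of the first record in the question.
cat > sample.json <<'EOF'
[
  {
    "Value1": "0",
    "pluginsIcons": [{"pluginName": "pdf", "pluginIcon": "pdf1"}],
    "customVariables": {
      "1": {"customVariableValue1": "Partner1", "customVariableName1": "Partner"}
    },
    "actionDetails": [
      {"timestamp": 84615195, "interactionPosition": "1", "type": "action"},
      {"timestamp": 84615145, "interactionPosition": "2", "type": "action"}
    ],
    "operatingSystem": "Windows 10"
  }
]
EOF

# The outer [ ... ] collects the stream of flattened objects into one array.
jq '[ .[]
      | .actionDetails[]
        + (del(.actionDetails) | del(.customVariables) | del(.pluginsIcons)) ]' sample.json
```

Each element of the resulting array is one action merged with the top-level fields of the record it came from.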

This is very similar to the expected output you show, so hopefully you can take it from here.
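
For the two parts skipped above, here is one possible way to fold .customVariables and .pluginsIcons into each flattened entry. It is a sketch based on my reading of your expected output, not a definitive solution. It assumes that every entry under customVariables holds exactly one key ending in Name<N> and one ending in Value<N>, that plugin keys should be numbered from 1 in array order, and that every record has an actionDetails array. The helper names named_vars and numbered_plugins and the file names flatten.jq and sample.json are made up for illustration.

```shell
# Hypothetical sample file: a cut-down version of the first record in the question.
cat > sample.json <<'EOF'
[
  {
    "Value1": "0",
    "pluginsIcons": [
      {"pluginName": "pdf", "pluginIcon": "pdf1"},
      {"pluginName": "java", "pluginIcon": "java1"}
    ],
    "customVariables": {
      "3": {"customVariableValue3": "F", "customVariableName3": "Gender"},
      "1": {"customVariableValue1": "Partner1", "customVariableName1": "Partner"}
    },
    "actionDetails": [
      {"timestamp": 84615195, "interactionPosition": "1", "type": "action"},
      {"timestamp": 84615693, "interactionPosition": "3", "type": "action",
       "customVariables": {
         "1": {"customVariablePageValue1": "value1", "customVariablePageName1": "name1"}
       }}
    ],
    "operatingSystem": "Windows 10"
  }
]
EOF

cat > flatten.jq <<'EOF'
# Turn {"1": {"...Name1": "Partner", "...Value1": "Partner1"}, ...}
# into {"Partner": "Partner1", ...}. A missing (null) input gives {}.
def named_vars:
  (. // {})
  | [ .[]
      | to_entries
      | { key:   (map(select(.key | test("Name[0-9]+$")))  | first | .value),
          value: (map(select(.key | test("Value[0-9]+$"))) | first | .value) } ]
  | from_entries;

# Turn [{"pluginName": "pdf", ...}, ...] into {"pluginName1": "pdf", ...},
# appending the 1-based array position to every key.
def numbered_plugins:
  (. // []) as $plugins
  | reduce range(0; $plugins | length) as $i
      ({}; . + ($plugins[$i] | with_entries(.key += ($i + 1 | tostring))));

[ .[]
  | (.customVariables | named_vars) as $vars
  | (.pluginsIcons | numbered_plugins) as $icons
  | del(.actionDetails, .customVariables, .pluginsIcons) as $shared
  | .actionDetails[]
  | (.customVariables | named_vars) as $pagevars
  | del(.customVariables) + $shared + $icons + $vars + $pagevars
]
EOF

jq -f flatten.jq sample.json
```

Values are copied exactly as they appear in the input, so pluginIcon1 comes out as "pdf1" here, and serverTimestamp is kept unless you delete it as well. When the same key exists on both sides of a +, the right-hand value wins, so action-level variables override record-level ones with the same name.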