Parsing nested JSON objects with jq: using select to match a key's value inside nested objects while keeping the existing structure

Date: 2019-12-17 23:03:44

Tags: json nested jq

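I am trying to take a complex JSON file of more than 20,000 lines and extract specific keys while preserving the surrounding metadata, which adds the context a human reader needs.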


Data source (complex structure):

{
  "Marketplace": [
    {
      "Level1Name": "Company A Products",
      "Level1Array": [
        {
          "Level2Name": "USA Products List",
          "Level2Contents": [
            {
              "Level3Name": "ALL",
              "Level3URL": "https://a.com/products"
            },
            {
              "Level3Name": "Subset1001",
              "Level3URL": "https://a.com/products/subset1001"
            }
          ]
        }
      ]
    },
    {
      "Level1Name": "Company B Products",
      "Level1Array": [
        {
          "Level2Name": "USA Products List",
          "Level2Contents": [
            {
              "Level3Name": "ALL",
              "Level3URL": "https://b.com/products"
            },
            {
              "Level3Name": "Subset500",
              "Level3URL": "https://b.com/products/subset500"
            }
          ]
        },
        {
          "Level2Name": "EU Products List",
          "Level2Contents": [
            {
              "Level3Name": "ALL",
              "Level3URL": "https://b.eu/products"
            },
            {
              "Level3Name": "Subset200",
              "Level3URL": "https://b.eu/products/subset200"
            }
          ]
        }
      ]
    },
    {
      "Level1Name": "Company X Products",
      "Level1Array": [
        {
          "Level2Name": "Deleted Products",
          "Level2URL": "https://internal.x.com/products"
        }
      ]
    }
  ]
}

The jq command I currently use for extraction drops all of the other contextual metadata...

jq -r '(
         .Marketplace[].Level1Array[].Level2Contents[]
         | select (.Level3Name | index("ALL"))
         | [.]
         )'

Output produced...

[
  {
    "Level3Name": "ALL",
    "Level3URL": "https://a.com/products"
  }
]
[
  {
    "Level3Name": "ALL",
    "Level3URL": "https://b.com/products"
  }
]
[
  {
    "Level3Name": "ALL",
    "Level3URL": "https://b.eu/products"
  }
]

Desired output, option 1: the same JSON structure, with every object that does not match the "ALL" string condition removed.

{
    "Marketplace": [
        {
            "Level1Name": "Company A Products",
            "Level1Array": [
                {
                    "Level2Name": "USA Products List",
                    "Level2Contents": [
                        {
                            "Level3Name": "ALL",
                            "Level3URL": "https://a.com/products"
                        }
                    ]
                }
            ]
        },
        {
            "Level1Name": "Company B Products",
            "Level1Array": [
                {
                    "Level2Name": "USA Products List",
                    "Level2Contents": [
                        {
                            "Level3Name": "ALL",
                            "Level3URL": "https://b.com/products"
                        }
                    ]
                },
                {
                    "Level2Name": "EU Products List",
                    "Level2Contents": [
                        {
                            "Level3Name": "ALL",
                            "Level3URL": "https://b.eu/products"
                        }
                    ]
                }
            ]
        }
    ]
}

Desired output, option 2: any similar format that can be iterated over in a loop, for example:

{
  "Marketplace": [
    {
      "Level1Name": "Company A Products",
      "Level2Name": "USA Products List",
      "Level3Name": "ALL",
      "Level3URL": "https://a.com/products"
    },
    {
      "Level1Name": "Company B Products",
      "Level2Name": "USA Products List",
      "Level3Name": "ALL",
      "Level3URL": "https://b.com/products"
    },
    {
      "Level1Name": "Company B Products",
      "Level2Name": "EU Products List",
      "Level3Name": "ALL",
      "Level3URL": "https://b.eu/products"
    }
  ]
}

2 answers:

Answer 0 (score: 0)

The following filter produces the "Option 2" output:

.Marketplace |= map(
  {Level1Name} as $Level1Name
  | .Level1Array[]
  | {Level2Name} as $Level2Name
  | .Level2Contents[]?
  | select(.Level3Name == "ALL")
  | $Level1Name + $Level2Name + . )

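Breaking it down...

One way to understand this is to consider the following filter, which emits the same flattened objects as a stream instead of collecting them back into the .Marketplace array: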

.Marketplace[]
| {Level1Name} as $Level1Name
| .Level1Array[]
| {Level2Name} as $Level2Name
| .Level2Contents[]?             # in case .Level2Contents is missing
| if (.Level3Name == "ALL")
  then $Level1Name + $Level2Name + .
  else empty
  end
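
As a rough illustration of the stream form above, here is a minimal shell sketch. It assumes `jq` is installed, and it uses a cut-down sample (the `sample` variable is made up for this example) with the same shape as the question's data. The `if ... then ... else empty end` is written as the equivalent `select(...)`.

```shell
#!/bin/sh
# Reduced sample with the same shape as the question's data (illustrative only).
sample='{"Marketplace":[
  {"Level1Name":"Company A Products","Level1Array":[
    {"Level2Name":"USA Products List","Level2Contents":[
      {"Level3Name":"ALL","Level3URL":"https://a.com/products"},
      {"Level3Name":"Subset1001","Level3URL":"https://a.com/products/subset1001"}]}]},
  {"Level1Name":"Company X Products","Level1Array":[
    {"Level2Name":"Deleted Products","Level2URL":"https://internal.x.com/products"}]}]}'

# Stream form: one flattened object per match, one per line (-c = compact output).
stream=$(printf '%s' "$sample" | jq -c '
  .Marketplace[]
  | {Level1Name} as $Level1Name
  | .Level1Array[]
  | {Level2Name} as $Level2Name
  | .Level2Contents[]?              # "?" skips entries without Level2Contents
  | select(.Level3Name == "ALL")
  | $Level1Name + $Level2Name + .')
printf '%s\n' "$stream"

# Each line is a complete JSON object, so a shell loop can consume it directly.
printf '%s\n' "$stream" | while IFS= read -r item; do
  printf '%s\n' "$item" | jq -r '"\(.Level1Name) / \(.Level2Name): \(.Level3URL)"'
done
```

Emitting one object per line tends to be convenient when the results are consumed by a loop, while the `.Marketplace |= map(...)` form keeps everything inside a single JSON document.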

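Addendum: "Name"

The OP subsequently asked what to do if the "name" keys at all three levels were simply called "Name". The answer is easily obtained by adapting the filter above: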

.Marketplace |= map(
  {Level1Name: .Name} as $Level1Name
  | .Level1Array[]
  | {Level2Name: .Name} as $Level2Name
  | .Level2Contents[]?
  | select(.Name == "ALL")
  | $Level1Name + $Level2Name + . )

Output

In this case, the output is as follows:

{
  "Marketplace": [
    {
      "Level1Name": "Company A Products",
      "Level2Name": "USA Products List",
      "Name": "ALL",
      "Level3URL": "https://a.com/products"
    },
    {
      "Level1Name": "Company B Products",
      "Level2Name": "USA Products List",
      "Name": "ALL",
      "Level3URL": "https://b.com/products"
    },
    {
      "Level1Name": "Company B Products",
      "Level2Name": "EU Products List",
      "Name": "ALL",
      "Level3URL": "https://b.eu/products"
    }
  ]
}
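
For comparison, the "Option 1" shape from the question (same nesting, non-matching objects removed) could be sketched in the same map/select style. This is not part of the original answer. It is a minimal sketch that assumes `jq` is installed, that the data always has exactly the three levels shown in the question, and that entries without `Level2Contents` should be dropped. The `sample` and `option1` variables are made up for this example.

```shell
#!/bin/sh
# Reduced sample with the same shape as the question's data (illustrative only).
sample='{"Marketplace":[
  {"Level1Name":"Company A Products","Level1Array":[
    {"Level2Name":"USA Products List","Level2Contents":[
      {"Level3Name":"ALL","Level3URL":"https://a.com/products"},
      {"Level3Name":"Subset1001","Level3URL":"https://a.com/products/subset1001"}]}]},
  {"Level1Name":"Company X Products","Level1Array":[
    {"Level2Name":"Deleted Products","Level2URL":"https://internal.x.com/products"}]}]}'

option1=$(printf '%s' "$sample" | jq '
  .Marketplace |= map(
    .Level1Array |= map(
      select(has("Level2Contents"))                        # drop entries with no contents
      | .Level2Contents |= map(select(.Level3Name == "ALL"))
      | select(.Level2Contents | length > 0))              # drop lists left empty
    | select(.Level1Array | length > 0))                   # drop companies left empty
')
printf '%s\n' "$option1"
```

Because every level is spelled out, this version only fits a known, fixed structure. A path-based filter is more general when the nesting depth varies.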

Answer 1 (score: 0)

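Here is another way you could approach this. As I understand it, you want to search a recursive tree of objects for a certain value and remove every object that does not have a property with that value.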

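What you can do is find the paths of all the values you want to keep (the ones holding the value you are searching for), and then delete every other object that does not lie on the path to one of the keepers.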

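# True if the input path is a prefix of (or equal to) any path in $paths.
def is_subpath($paths): [., length] as [$path, $length] |
    any($paths[]; $length <= length and $path == .[:$length]);
# Paths of the objects that directly contain the string "ALL".
[paths(strings == "ALL")[:-1]] as $keepers
# Delete every object that is not on the way to one of the keepers.
| delpaths([paths(objects) | select(is_subpath($keepers) | not)])

Output:

{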
  "Marketplace": [
    {
      "Level1Name": "Company A Products",
      "Level1Array": [
        {
          "Level2Name": "USA Products List",
          "Level2Contents": [
            {
              "Level3Name": "ALL",
              "Level3URL": "https://a.com/products"
            }
          ]
        }
      ]
    },
    {
      "Level1Name": "Company B Products",
      "Level1Array": [
        {
          "Level2Name": "USA Products List",
          "Level2Contents": [
            {
              "Level3Name": "ALL",
              "Level3URL": "https://b.com/products"
            }
          ]
        },
        {
          "Level2Name": "EU Products List",
          "Level2Contents": [
            {
              "Level3Name": "ALL",
              "Level3URL": "https://b.eu/products"
            }
          ]
        }
      ]
    }
  ]
}
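
To see what the path-based filter works with, it can help to print its two intermediate values: the `$keepers` paths and the list of object paths handed to `delpaths`. The following is a minimal sketch that assumes `jq` 1.5 or later is installed and uses a cut-down sample with the same shape as the question's data. The `sample`, `keepers`, and `doomed` variable names are made up for this example.

```shell
#!/bin/sh
# Reduced sample with the same shape as the question's data (illustrative only).
sample='{"Marketplace":[
  {"Level1Name":"Company A Products","Level1Array":[
    {"Level2Name":"USA Products List","Level2Contents":[
      {"Level3Name":"ALL","Level3URL":"https://a.com/products"},
      {"Level3Name":"Subset1001","Level3URL":"https://a.com/products/subset1001"}]}]},
  {"Level1Name":"Company X Products","Level1Array":[
    {"Level2Name":"Deleted Products","Level2URL":"https://internal.x.com/products"}]}]}'

# Paths of the objects that directly hold the string "ALL".
keepers=$(printf '%s' "$sample" | jq -c '[paths(strings == "ALL")[:-1]]')
printf 'keepers: %s\n' "$keepers"

# Object paths that are not a prefix of any keeper, i.e. what delpaths removes.
doomed=$(printf '%s' "$sample" | jq -c '
  def is_subpath($paths): [., length] as [$path, $length] |
      any($paths[]; $length <= length and $path == .[:$length]);
  [paths(strings == "ALL")[:-1]] as $keepers
  | [paths(objects) | select(is_subpath($keepers) | not)]')
printf 'doomed:  %s\n' "$doomed"
```

With this sample, the single keeper is the first `Level2Contents` entry under Company A. The `Subset1001` object and the whole Company X entry are the paths selected for deletion.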