Using bash

Time: 2015-11-24 19:30:16

Tags: json bash curl awk

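I have a large chunk of JSON containing about 10 unique elements. Each element contains an ID, some other attributes, and a links attribute (some of whose entries also have IDs). Is there a way to use bash (preferably without external libraries) to get only the top-level ID of each element in the JSON?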

Here is an example:

{
"page": {
    "size": 10,
    "number": 1,
    "totalPages": 1,
    "totalElements": 10,
    "resultSetId": "TODO",
    "duration": 999
},
"content": [
    {
        "id": "fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07",
        "name": "volume 0",
        "userTags": [],
        "links": [
            {
                "rel": "whatever",
                "href": "/whatever/67b46e10-21ed-4394-b706-9eb61d75933e",
                "id": "67b46e10-21ed-4394-b706-9eb61d75933e"
            },
            {
                "rel": "whatever_else",
                "href": "/whatever_else/fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07/workflowList"
            },
            {
                "rel": "stuff",
                "href": "/stuff/fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07/planList"
            },
            {
                "rel": "self",
                "href": "/self/fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07",
                "id": "fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07"
            },
            {
                "rel": "container",
                "href": "/container/575a0c38-c60a-4d52-ba38-cb20f4b6d9e7",
                "id": "575a0c38-c60a-4d52-ba38-cb20f4b6d9e7"
            },
            {
                "rel": "parent",
                "href": "/parent/85b7f0e7-b946-4bc4-9ca6-582a5ca08c51",
                "id": "85b7f0e7-b946-4bc4-9ca6-582a5ca08c51"
            }
        ],
        "discovered": false,
        "lastUpdated": "2015-11-20T09:33:05.757-0800",
        "nativeUri": null,
        "vendor": null,
        "suspended": [],
        "enabled": []
    },
    {
        "id": "4292014f-01cd-4369-9cc0-7bf41a8be53d",
        "name": "Storage_Group_001",
        "attributes": {},
        "userTags": [],
        "links": [
            {
                "rel": "stuff",
                "href": "/stuff/67b46e10-21ed-4394-b706-9eb61d75933e",
                "id": "67b46e10-21ed-4394-b706-9eb61d75933e"
            },
            {
                "rel": "something",
                "href": "/something/4292014f-01cd-4369-9cc0-7bf41a8be53d/workflowList"
            },
            {
                "rel": "whatever",
                "href": "/whatever/4292014f-01cd-4369-9cc0-7bf41a8be53d/planList"
            },
            {
                "rel": "self",
                "href": "/self/4292014f-01cd-4369-9cc0-7bf41a8be53d",
                "id": "4292014f-01cd-4369-9cc0-7bf41a8be53d"
            },
            {
                "rel": "container",
                "href": "/stuff/575a0c38-c60a-4d52-ba38-cb20f4b6d9e7",
                "id": "575a0c38-c60a-4d52-ba38-cb20f4b6d9e7"
            }
        ],
        "lastUpdated": "2015-11-18T06:37:56.739-0800",
        "nativeUri": null,
        "vendor": null,
        "suspended": [],
        "enabled": []
    },
    {
        "id": "896aca64-17a6-4acb-a93c-562424dc1bc4",
        "name": "volume 4",
        "attributes": {},
...

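Basically, I only want the top-level ID of each section, and none of the IDs inside the links section. I got close with awk, and also with perl, but I cannot predict exactly how many ids the links section will contain. Here is my awk attempt (it assumes there are exactly 5 entries between the desired ids; I also dumped the JSON into a temp file so I don't have to curl it every time):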

awk '{if (count++%5==0) print $0;}' <(cat tmp.txt | grep -Po '(?<="id":")[^"]*')

2 answers:

Answer 0: (score: 1)

Use jq:

jq '.content[] | .id' some.json
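As a rough sketch of how that filter behaves, assuming jq is installed: the file name sample.json and its shortened contents below are a hypothetical stand-in for the real response, reusing two ids from the question. The -r flag makes jq print raw strings without the surrounding quotes.

```shell
# Hypothetical, shortened stand-in for the real response (file name is an assumption).
cat > sample.json <<'EOF'
{
  "content": [
    {
      "id": "fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07",
      "links": [
        { "rel": "self",   "id": "fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07" },
        { "rel": "parent", "id": "85b7f0e7-b946-4bc4-9ca6-582a5ca08c51" }
      ]
    },
    {
      "id": "4292014f-01cd-4369-9cc0-7bf41a8be53d",
      "links": []
    }
  ]
}
EOF

# .content[] iterates over the array elements; .id reads only each element's own id,
# so the ids nested under "links" are never visited.
ids=$(jq -r '.content[].id' sample.json)
printf '%s\n' "$ids"
```

Because the filter addresses the id by its path, it does not matter how many ids each links array contains.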

Answer 1: (score: 1)

Here is an awk-only "solution" ("solution" being a bit optimistic, since awk is not a JSON parser):

awk '$0 ~ /{/ {count++} 
     $0 ~ /}/ {count--} 
     $0 ~ "\"id\":"&& count==2 {print $0}' inputFile

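We keep a running count of open curly braces: it goes up by one on a line containing { and down by one on a line containing }. A line containing "id": is printed only while that count is exactly 2, i.e. directly inside an element of content and not inside links. Output for your example: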

"id": "fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07",
"id": "4292014f-01cd-4369-9cc0-7bf41a8be53d",
"id": "896aca64-17a6-4acb-a93c-562424dc1bc4",

This solution assumes that each line contains at most one brace ({ or }).
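If only the bare values are wanted rather than the whole lines, the same brace-counting idea can be sketched as follows. This is an illustration under the same one-brace-per-line assumption, not a JSON parser; the file name sample.json, its shortened contents, and the variable names depth and top_ids are placeholders.

```shell
# Hypothetical, shortened input laid out one brace per line, like the question's example.
cat > sample.json <<'EOF'
{
"page": {
    "size": 10
},
"content": [
    {
        "id": "fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07",
        "links": [
            {
                "rel": "self",
                "id": "fbc67d7a-50a3-4c1c-9a75-4db0ba5dcb07"
            },
            {
                "rel": "parent",
                "id": "85b7f0e7-b946-4bc4-9ca6-582a5ca08c51"
            }
        ]
    },
    {
        "id": "4292014f-01cd-4369-9cc0-7bf41a8be53d",
        "links": []
    }
]
}
EOF

top_ids=$(awk '
    /[{]/ { depth++ }                 # a line with an opening brace goes one level deeper
    /[}]/ { depth-- }                 # a line with a closing brace comes back up
    depth == 2 && $1 == "\"id\":" {   # depth 2 = directly inside an element of "content"
        gsub(/[",]/, "", $2)          # strip the quotes and the trailing comma from the value
        print $2
    }
' sample.json)
printf '%s\n' "$top_ids"
```

The bracket expressions [{] and [}] are used instead of a bare brace in the regex only to sidestep differences in how awk implementations treat unescaped braces.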

Alternatively, you might take a look at jsawk, which is like awk, but for JSON. (If you can chmod the file, it is probably the better option.)