How do I filter by a numeric field using jq?

Asked: 2018-03-16 14:35:14

Tags: json bash select jq bitbucket-api

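I am writing a script that queries the Bitbucket API and deletes SNAPSHOT artifacts that have never been downloaded. The script fails because it picks up every SNAPSHOT artifact; the selection on download count does not seem to work.

What is wrong with my select statement that filters objects by download count?

Of course, if I could query the Bitbucket API with a filter, that would be the more direct solution. As far as I know, the API does not support filtering on downloads.

My script is: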

#!/usr/bin/env bash
curl -X GET --user "me:mykey" "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads?pagelen=100" > downloads.json

# get all values | reduce the set to just be name and downloads | select entries where downloads is zero | select entries where name contains SNAPSHOT | just get the name
#TODO i screwed up the selection somewhere its returning files that contain SNAPSHOT regardless of number of downloads
jq '.values | {name: .[].name, downloads: .[].downloads} | select(.downloads==0) | select(.name | contains("SNAPSHOT")) | .name' downloads.json > snapshots_without_any_downloads.js
#unique sort, not sure why jq gives me multiple values
sort -u snapshots_without_any_downloads.js | tr -d '"' > unique_snapshots_without_downloads.js

cat unique_snapshots_without_downloads.js | xargs -t -I % curl -Ss -X DELETE --user "me:mykey" "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads/%" > deleted_files.txt

A de-identified sample of the raw input from the API:

{
  "pagelen": 10,
  "size": 40,
  "values": [
    {
      "name": "myproject_1.1-SNAPSHOT_0210f77_mc_3.5.0.zip",
      "links": {
        "self": {
          "href": "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads/myproject_1.1-SNAPSHOT_0210f77_mc_3.5.0.zip"
        }
      },
      "downloads": 2,
      "created_on": "2018-03-15T17:50:00.157310+00:00",
      "user": {
        "username": "me",
        "display_name": "me",
        "type": "user",
        "uuid": "{3051ec5f-cc92-4bc3-b291-38189a490a89}",
        "links": {
          "self": {
            "href": "https://api.bitbucket.org/2.0/users/me"
          },
          "html": {
            "href": "https://bitbucket.org/me/"
          },
          "avatar": {
            "href": "https://bitbucket.org/account/me/avatar/32/"
          }
        }
      },
      "type": "download",
      "size": 430894
    },
    {
      "name": "myproject_1.1-SNAPSHOT_thanks_for_the_reminder_charles_duffy_mc_3.5.0.zip",
      "links": {
        "self": {
          "href": "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads/myproject_1.1-SNAPSHOT_0210f77_mc_3.5.0.zip"
        }
      },
      "downloads": 0,
      "created_on": "2018-03-15T17:50:00.157310+00:00",
      "user": {
        "username": "me",
        "display_name": "me",
        "type": "user",
        "uuid": "{3051ec5f-cc92-4bc3-b291-38189a490a89}",
        "links": {
          "self": {
            "href": "https://api.bitbucket.org/2.0/users/me"
          },
          "html": {
            "href": "https://bitbucket.org/me/"
          },
          "avatar": {
            "href": "https://bitbucket.org/account/me/avatar/32/"
          }
        }
      },
      "type": "download",
      "size": 430894
    },
    {
      "name": "myproject_1.0_mc_3.5.1.zip",
      "links": {
        "self": {
          "href": "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads/myproject_1.1-SNAPSHOT_0210f77_mc_3.5.1.zip"
        }
      },
      "downloads": 5,
      "created_on": "2018-03-15T17:49:14.885544+00:00",
      "user": {
        "username": "me",
        "display_name": "me",
        "type": "user",
        "uuid": "{3051ec5f-cc92-4bc3-b291-38189a490a89}",
        "links": {
          "self": {
            "href": "https://api.bitbucket.org/2.0/users/me"
          },
          "html": {
            "href": "https://bitbucket.org/me/"
          },
          "avatar": {
            "href": "https://bitbucket.org/account/me/avatar/32/"
          }
        }
      },
      "type": "download",
      "size": 430934
    }
  ],
  "page": 1,
  "next": "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads?pagelen=10&page=2"
}

The output I expect from this snippet is myproject_1.1-SNAPSHOT_thanks_for_the_reminder_charles_duffy_mc_3.5.0.zip: that artifact is a SNAPSHOT and has zero downloads.

I did some debugging with this intermediate step:

jq '.values | {name: .[].name, downloads: .[].downloads} | select(.downloads>0) | select(.name | contains("SNAPSHOT")) | unique' downloads.json > snapshots_with_downloads.js
jq '.values | {name: .[].name, downloads: .[].downloads} | select(.downloads==0) | select(.name | contains("SNAPSHOT")) | .name' downloads.json > snapshots_without_any_downloads.js
#this returns the same values for each list!
diff unique_snapshots_with_downloads.js unique_snapshots_without_downloads.js

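This tweak yields a cleaner, unique structure, which suggests there is some splitting or streaming aspect of jq that I do not fully understand yet: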

#this returns a "unique" array like I expect, adding select to this still does not produce the desired outcome 
jq '.values | [{name: .[].name, downloads: .[].downloads}] | unique' downloads.json

The data after this step looks like the following. It has simply dropped the parts of the original API response that I do not need:

[
  {
    "name": "myproject_1.0_2400a51_mc_3.4.0.zip",
    "downloads": 0
  },
  {
    "name": "myproject_1.0_2400a51_mc_3.4.1.zip",
    "downloads": 2
  },
  {
    "name": "myproject_1.1-SNAPSHOT_391f4d5_mc_3.5.0.zip",
    "downloads": 0
  },
  {
    "name": "myproject_1.1-SNAPSHOT_391f4d5_mc_3.5.1.zip",
    "downloads": 2
  }
]

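2 Answers:

Answer 0: (score: 2)

As I understand it:

  • You want globally unique output
  • You only want items with downloads==0
  • You only want items whose name contains "SNAPSHOT"

The following will accomplish that: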

jq -r '
[.values[] | {(.name): .downloads}]
| add
| to_entries[]
| select(.value == 0)
| .key | select(contains("SNAPSHOT"))'

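Rather than making unique an explicit step, this version builds a map from names to download counts (add merges the objects together, which means that on a conflict the last one wins), and that guarantees the output is unique.

Given your test JSON, the output is: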

myproject_1.1-SNAPSHOT_thanks_for_the_reminder_charles_duffy_mc_3.5.0.zip
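
The last-one-wins behavior of add can be checked in isolation. The following is a minimal sketch, assuming jq is installed; the file names a.zip and b.zip are hypothetical and do not come from the question's data.

```shell
# Hypothetical single-key objects; "a.zip" appears twice with different counts.
objects='[{"a.zip": 2}, {"b.zip": 0}, {"a.zip": 0}]'

# add merges the objects left to right, so the later "a.zip" value replaces the earlier one.
merged=$(printf '%s\n' "$objects" | jq -c 'add')
echo "$merged"

# After merging there is exactly one entry per distinct name.
entry_count=$(printf '%s\n' "$objects" | jq 'add | to_entries | length')
echo "$entry_count"
```

Note that this means a name listed twice is judged only by its last count; if the counts should be summed instead, a reduce-based approach is needed.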

Applied to the overall context of the question, this strategy can simplify the whole process:

jq -r '[.values[] | {(.links.self.href): .downloads}] |  add | to_entries[] | select(.value == 0) | .key | select(contains("SNAPSHOT"))'

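It simplifies the overall process by operating on each file's URL rather than just its name. That simplifies the subsequent DELETE call, and the sort and tr calls can be dropped as well.

Answer 1: (score: 2)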
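
One way the URL-based filter might be wired into the deletion step is sketched below. The function names unused_snapshot_urls and delete_unused_snapshots are hypothetical, the "add // {}" guard for an empty page is an addition not present in the answer, and the loop only prints the curl commands (a dry run) so that the list can be reviewed first. The "me:mykey" credentials are the placeholder from the question.

```shell
# Print the URL of every SNAPSHOT artifact whose download count is 0.
# Reads one page of the downloads API response from stdin.
unused_snapshot_urls() {
  jq -r '[.values[] | {(.links.self.href): .downloads}]
         | add // {}
         | to_entries[]
         | select(.value == 0)
         | .key | select(contains("SNAPSHOT"))'
}

# Dry run: print the DELETE commands instead of executing them.
# Remove "echo" only after checking the printed list.
delete_unused_snapshots() {
  unused_snapshot_urls | while IFS= read -r url; do
    echo curl -Ss -X DELETE --user "me:mykey" "$url"
  done
}
```

With the response saved as in the question, this could be invoked as delete_unused_snapshots < downloads.json. Because the keys are URLs, two entries that share the same href collapse into one, with the last download count winning.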

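Here is a solution that sums the .downloads values for each .name before selecting on the total number of downloads:

reduce (.values[] | select(.name | contains("SNAPSHOT"))) as $v
  ({}; .[$v.name] += $v.downloads)
| with_entries(select(.value == 0))
| keys_unsorted[]

Example:

$ jq -r -f program.jq input.json
myproject_1.1-SNAPSHOT_thanks_for_the_reminder_charles_duffy_mc_3.5.0.zip

P.S.

  What is wrong with my select statement ...?

The problem that jumps out is the part of the pipeline just before the "select" filter:

.values | {name: .[].name, downloads: .[].downloads}

Using .[] in this way produces a Cartesian product: the expression above emits n * n JSON objects, where n is the length of .values. You evidently intended to write:

.values[] | {name: .name, downloads: .downloads}

which can be abbreviated to:

.values[] | {name, downloads}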
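
The n * n blow-up can be observed on a tiny input. This is a minimal sketch, assuming jq is installed; the two-element document and its names a-SNAPSHOT and b-SNAPSHOT are hypothetical stand-ins for the API response.

```shell
# Hypothetical two-element stand-in for the API response.
input='{"values":[{"name":"a-SNAPSHOT","downloads":0},{"name":"b-SNAPSHOT","downloads":2}]}'

# Original form: each .[] iterates independently, so 2 * 2 objects are emitted.
product_count=$(printf '%s\n' "$input" | jq '[.values | {name: .[].name, downloads: .[].downloads}] | length')

# Corrected form: iterate once, then build one object per element.
fixed_count=$(printf '%s\n' "$input" | jq '[.values[] | {name, downloads}] | length')

# Names that survive select(.downloads == 0) under each form.
product_names=$(printf '%s\n' "$input" | jq -c '[.values | {name: .[].name, downloads: .[].downloads} | select(.downloads == 0) | .name] | unique')
fixed_names=$(printf '%s\n' "$input" | jq -c '[.values[] | {name, downloads} | select(.downloads == 0) | .name]')

echo "$product_count $fixed_count"
echo "$product_names"
echo "$fixed_names"
```

In the original form every name gets paired with every download count, so every name passes the zero-download test as long as at least one artifact has zero downloads. That matches the symptom in the question, where both lists came out identical.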