Question

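I have been reading this StackOverflow discussion on converting JSON to CSV, and it looks great, but I cannot get even basic jq to work and I am not sure what I am doing wrong. I tried the simplest things and cannot work out what the error is. Here is my ES query in a shell script: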

curl -XGET 'http://es-1:9200/data_latest/customer/_search?pretty' -H 'Content-Type: application/json' -d'
{
"_source": ["customer_app_version", "customer_num_apps", "customer_name","app_disk_size_bytes","app_memory_capacity_bytes"],
    "query": {
        "bool": {
            "must": [{
                "term": {
                    "is_app_customer": {
                        "value": "true"
                    }
                }
            }]
        }
    },
    "aggs": {
        "Customer_UUID": {
            "terms": {
                "field": "customer_uuid",
                "size": 100
            }
        }
    }
}

'

Shell script output:

{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 6171,
    "max_score": 1.8510876,
    "hits": [
      {
        "_index": "data_latest_v1",
        "_type": "customer",
        "_id": "0003245-4844-9015-1z2e-d4ae5234rd56",
        "_score": 1.8510876,
        "_source": {
          "customer_app_version": "el7.20150513",
          "customer_num_apps": 3,
          "app_memory_capacity_bytes": 405248409600,
          "customer_name": "Timbuktu Inc",
          "app_disk_size_bytes": 25117047875604
        }
      },
      {
        "_index": "data_latest_v1",
        "_type": "customer",
        "_id": "0003245-4844-9015-1z2e-d4ae5234rd56",
        "_score": 1.8510876,
        "_source": {
          "customer_app_version": "el4.20150513",
          "customer_num_apps": 34,
          "app_memory_capacity_bytes": 58923439600,
          "customer_name": "Bunnies Inc",
          "app_disk_size_bytes": 36517984275604
        }
      }
    ]
  }
}

(Truncated, but the subset above is syntactically valid.)

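How can I use jq in the shell script to output the keys and values in the _source field (and nothing else) as CSV? I know I am asking about what the other discussion already describes, but I tried it and could not figure it out.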

For example, after the closing ' (the end of the script above) I added

| jq -r '."customer_name"'

and also tried

| jq -r '.customer_name'

Both produce output like this:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
103 13566  100 13566    0   346   507k  13248 --:--:-- --:--:-- --:--:--  537k
null

What am I doing wrong, and what do I need to do? Any guidance would be very helpful.

Answer 1

A jq query has to describe how to navigate through the document to the data you want to extract. That might look like:

jq -r '.hits.hits[]._source.customer_name'

In this case, the output is:

Timbuktu Inc
Bunnies Inc
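
The null in the question follows from the same navigation rule: the top-level object of the response only contains took, timed_out, _shards and hits, so .customer_name has nothing to match there. A minimal sketch of the difference, using a one-hit response trimmed down here for illustration (the variable names are assumptions, not part of the original script):

```shell
# Trimmed sample response; the nesting mirrors the output shown in the question.
response='{"hits":{"hits":[{"_source":{"customer_name":"Timbuktu Inc"}}]}}'

# The top-level object has no customer_name key, so jq emits null.
top_level=$(printf '%s' "$response" | jq -r '.customer_name')

# Walking down through hits.hits[] to _source reaches the field.
nested=$(printf '%s' "$response" | jq -r '.hits.hits[]._source.customer_name')

echo "top-level: $top_level"   # prints: top-level: null
echo "nested: $nested"         # prints: nested: Timbuktu Inc
```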

To generate key/value CSV, you can use:

jq -r '.hits.hits[]._source | to_entries | .[] | [.key, .value] | @csv'

...with output:

"customer_app_version","el7.20150513"
"customer_num_apps",3
"app_memory_capacity_bytes",405248409600
"customer_name","Timbuktu Inc"
"app_disk_size_bytes",25117047875604
"customer_app_version","el4.20150513"
"customer_num_apps",34
"app_memory_capacity_bytes",58923439600
"customer_name","Bunnies Inc"
"app_disk_size_bytes",36517984275604

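To see what the to_entries step contributes to that pipeline, it can help to run it on its own. The sketch below applies it to a hypothetical two-field _source object cut down from the first hit in the question; source_doc and entries are illustrative names only:

```shell
# A cut-down _source object (two of the five fields from the first hit).
source_doc='{"customer_name":"Timbuktu Inc","customer_num_apps":3}'

# to_entries turns an object into an array of {key, value} pairs.
entries=$(printf '%s' "$source_doc" | jq -c 'to_entries')

echo "$entries"
# prints: [{"key":"customer_name","value":"Timbuktu Inc"},{"key":"customer_num_apps","value":3}]
```

Each {key, value} object is then turned into a two-element array by [.key, .value] and rendered as one CSV row by @csv.
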
If you want the customer name in its own column, that might instead be:

jq -r '.hits.hits[]._source | .customer_name as $name | del(.customer_name) | to_entries | .[] | [$name, .key, .value] | @csv'

...with output:

"Timbuktu Inc","customer_app_version","el7.20150513"
"Timbuktu Inc","customer_num_apps",3
"Timbuktu Inc","app_memory_capacity_bytes",405248409600
"Timbuktu Inc","app_disk_size_bytes",25117047875604
"Bunnies Inc","customer_app_version","el4.20150513"
"Bunnies Inc","customer_num_apps",34
"Bunnies Inc","app_memory_capacity_bytes",58923439600
"Bunnies Inc","app_disk_size_bytes",36517984275604

If you are willing to hardcode the column names, consider instead:

jq -r '.hits.hits[]._source | [.customer_name, .customer_app_version, .customer_num_apps, .app_memory_capacity_bytes, .app_disk_size_bytes] | @csv'

...with output:

"Timbuktu Inc","el7.20150513",3,405248409600,25117047875604
"Bunnies Inc","el4.20150513",34,58923439600,36517984275604

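If a header row is wanted as well, one possible variation on the hardcoded approach is to keep the column names in a jq variable and use it both for the header and for the lookups. This is only a sketch: es_to_csv is a hypothetical helper name, and the sample input is a single hit copied from the question.

```shell
# Hypothetical helper: reads an Elasticsearch response on stdin
# and writes CSV with a header row to stdout.
es_to_csv() {
  jq -r '
    ["customer_name", "customer_app_version", "customer_num_apps",
     "app_memory_capacity_bytes", "app_disk_size_bytes"] as $cols
    | $cols, (.hits.hits[]._source | [.[$cols[]]])
    | @csv'
}

# One hit from the question, used as sample input.
sample='{"hits":{"hits":[{"_source":{"customer_app_version":"el7.20150513","customer_num_apps":3,"app_memory_capacity_bytes":405248409600,"customer_name":"Timbuktu Inc","app_disk_size_bytes":25117047875604}}]}}'

printf '%s' "$sample" | es_to_csv
```

In the original script, the curl command would be piped into es_to_csv in place of printf. Adding -s to curl suppresses the progress meter seen in the question; that meter is written to stderr, so it clutters the terminal but does not end up in the CSV.
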
Using jq to extract data from the _source documents in ElasticSearch results
