Using pyjq, I can parse values out of a JSON file. I need to format the output further so that it can be exported to CSV.
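import json
import csv
import pyjq

# Load the JSON document and close the file when done.
with open('example.json', 'r') as emp_data:
    emp_data_parsed = json.load(emp_data)

# The comma in the filter emits all uids first, then all names.
emp = pyjq.all('.base[].base[].uid, .base[].base[].name', emp_data_parsed)
print(emp)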
The output I get:
[u'2da21174-0af8-4b5b-b02e-2957a24d70e1', u'fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba', u'4ecf6450-7307-466c-bf19-663ba2fbaf69', None, u'Tommy', u'Sam',
The expected output is as follows, so that it can be written to a CSV file:
uid,name
'2da21174-0af8-4b5b-b02e-2957a24d70e1','None'
'fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba','Tommy'
'4ecf6450-7307-466c-bf19-663ba2fbaf69','Sam'
Here is the sample file, example.json:
{
  "base": [
    {
      "base": [
        {
          "item-number": 1,
          "type": "access-item",
          "uid": "2da21174-0af8-4b5b-b02e-2957a24d70e1",
          "usage": {
            "last-date": {
              "iso-8601": "2018-03-19T03:58-0500"
            }
          }
        },
        {
          "item-number": 2,
          "name": "Tommy",
          "type": "access-item",
          "uid": "fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba"
        },
        {
          "item-number": 3,
          "name": "Sam",
          "type": "access-item",
          "uid": "4ecf6450-7307-466c-bf19-663ba2fbaf69",
          "usage": {
            "last-date": {
              "iso-8601": "2018-03-21T07:21-0500"
            }
          }
        }
      ]
    }
  ]
}
I'm not sure whether there is a way to do this other than with pyjq. If there is, please let me know.
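Answer 0: (score: 2)
Question: I need to format the output further so that it can be exported to CSV.
I can't test this with pyjq, so this is a guess based on the project description. Try: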
pyjq.all('.base[].base[] | {"uid": .uid, "item-number": .["item-number"]}', emp_data_parsed)
Alternatively, loop over your JSON like this:
for rec in emp_data_parsed['base'][0]['base']:
    print("{}".format(rec))
Output:
{'uid': '2da21174-0af8-4b5b-b02e-2957a24d70e1', 'item-number': 1}, ... (omitted for brevity)
{'uid': 'fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba', 'item-number': 2}, ... (omitted for brevity)
{'uid': '4ecf6450-7307-466c-bf19-663ba2fbaf69', 'item-number': 3}, ... (omitted for brevity)
These records are ready to be consumed by csv.DictWriter, for example:
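import csv

# newline='' lets the csv module control line endings (Python 3).
with open('test.csv', 'w', newline='') as csv_file:
    fieldnames = ['uid', 'item-number']
    # extrasaction='ignore' skips record keys that are not in fieldnames.
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames, extrasaction='ignore')
    writer.writeheader()
    for record in emp_data_parsed['base'][0]['base']:
        writer.writerow(record)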
Output:
uid,item-number
2da21174-0af8-4b5b-b02e-2957a24d70e1,1
fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba,2
4ecf6450-7307-466c-bf19-663ba2fbaf69,3
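The question asked for uid and name columns, with None where a record has no name. Below is a minimal sketch of that variant using only the standard library, without pyjq. The helper name write_uid_name and the inline sample data are hypothetical, and restval='None' is just one way to fill the missing field.

```python
import csv
import io

# Inline stand-in for the parsed example.json (only the keys used here).
emp_data_parsed = {
    "base": [{
        "base": [
            {"item-number": 1, "uid": "2da21174-0af8-4b5b-b02e-2957a24d70e1"},
            {"item-number": 2, "name": "Tommy",
             "uid": "fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba"},
            {"item-number": 3, "name": "Sam",
             "uid": "4ecf6450-7307-466c-bf19-663ba2fbaf69"},
        ]
    }]
}


def write_uid_name(data, out):
    """Write one uid,name row per record found under base[].base[]."""
    writer = csv.DictWriter(
        out,
        fieldnames=["uid", "name"],
        restval="None",          # written when a record lacks a field
        extrasaction="ignore",   # skip keys that are not in fieldnames
        lineterminator="\n",
    )
    writer.writeheader()
    for outer in data["base"]:
        for record in outer["base"]:
            writer.writerow(record)


buffer = io.StringIO()
write_uid_name(emp_data_parsed, buffer)
print(buffer.getvalue(), end="")
```

To write a file instead of an in-memory buffer, pass a handle opened with open('test.csv', 'w', newline='') as the second argument.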
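Answer 1: (score: 1)
Interesting. I know jq, and a Python wrapper for it is a nice idea.
I use jq for data processing, along with grep, head, and so on :) When I need to work with CSV, I prefer to write a CSV-to-JSONL (and vice versa) program once and then use it in a shell pipeline.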
# to_csv.py
import csv, json, sys

# Read one JSON object per line (JSONL) from standard input.
rows = [json.loads(line) for line in sys.stdin]

# Collect column names in first-seen order across all rows.
all_keys = []
for row in rows:
    for key in row.keys():
        if key not in all_keys:
            all_keys.append(key)

writer = csv.DictWriter(sys.stdout, fieldnames=all_keys, extrasaction='ignore')
writer.writeheader()
for row in rows:
    writer.writerow(row)
Usage (I had to fix example.json slightly):
$ cat example.json | jq -c '.base[].base[] | { uid, name }' | python3 to_csv.py
uid,name
2da21174-0af8-4b5b-b02e-2957a24d70e1,
fcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba,Tommy
4ecf6450-7307-466c-bf19-663ba2fbaf69,Sam
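For the "vice versa" direction mentioned above, a rough sketch of a CSV-to-JSONL converter might look like this. The function name csv_to_jsonl and the sample row are assumptions for illustration, not part of the original answer.

```python
import csv
import io
import json


def csv_to_jsonl(src, dst):
    """Read CSV with a header row from src; write one JSON object per line to dst."""
    for row in csv.DictReader(src):
        dst.write(json.dumps(row) + "\n")


# Small in-memory demonstration.
sample = io.StringIO("uid,name\nfcc5a2c8-3a78-4cc5-9fd3-e7bd59eb36ba,Tommy\n")
out = io.StringIO()
csv_to_jsonl(sample, out)
print(out.getvalue(), end="")
```

As a pipeline stage, the same function would be called with sys.stdin and sys.stdout. Note that every value comes back as a string, and an empty CSV cell becomes an empty string rather than null.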