我有一个具有以下结构的json文件:
[{
"field1": "first",
"field2": "d",
"id": 35,
"features": [
{
"feature_id": 2,
"value": 6
},
{
"feature_id": 3,
"value": 8.5
},
{
"feature_id":5,
"value":6.7
},
{
"feature_id":10,
"value": 3.4
}
],
"time": "2018-11-17"
},
{
"field1": "second",
"field2": "b",
"id": 36,
"features": [
{
"feature_id": 3,
"value": 5.4
},
{
"feature_id": 10,
"value": 9.5
},
],
"time": "2018-11-17"
}]
我可以将其更改为Pandas Dataframe
import json
import pandas as pd
with open(file) as json_data:
data = json.load(json_data)
df=pd.DataFrame(data)
,但是一列在列表中具有嵌套的字典,因此features列包含具有字典列表的列。我想整理全部数据,因此最终表应如下所示。感谢任何帮助吗?
答案 0 :(得分:0)
Json被[]
更改为有效:
data = [{
"field1": "first",
"field2": "d",
"id": 35,
"features": [
{
"feature_id": 2,
"value": 6
},
{
"feature_id": 3,
"value": 8.5
},
{
"feature_id":5,
"value":6.7
},
{
"feature_id":10,
"value": 3.4
}
],
"time": "2018-11-17"
},
{
"field1": "second",
"field2": "b",
"id": 36,
"features": [
{
"feature_id": 3,
"value": 5.4
},
{
"feature_id": 10,
"value": 9.5
},
],
"time": "2018-11-17"
}]
然后循环每个项目并为features
创建新的字典对象,最后一遍是DataFrame
的构造者:
L = []
for x in data:
d = {}
for k, v in x.items():
if k == 'features':
for y in v:
d[f"feature_id_{y['feature_id']}"] = y['value']
else:
d[k] = v
L.append(d)
df = pd.DataFrame(L)
print (df)
feature_id_10 feature_id_2 feature_id_3 feature_id_5 field1 field2 id \
0 3.4 6.0 8.5 6.7 first d 35
1 9.5 NaN 5.4 NaN second b 36
time
0 2018-11-17
1 2018-11-17
答案 1 :(得分:0)
对于具有嵌套键的扁平化JSON对象成单个Dict,请使用以下函数。
def flatten_json(nested_json):
"""
Flatten json object with nested keys into a single level.
Args:
nested_json: A nested json object.
Returns:
The flattened json object if successful, None otherwise.
"""
out = {}
def flatten(x, name=''):
if type(x) is dict:
for a in x:
flatten(x[a], name + a + '_')
elif type(x) is list:
i = 0
for a in x:
flatten(a, name + str(i) + '_')
i += 1
else:
out[name[:-1]] = x
flatten(nested_json)
return out
希望此功能对您有帮助。