I am looking into how to flatten a nested list inside a dictionary that is itself nested in a list, using Python.
Example below:
[
{
"id": 8,
"category": {
"id": 0,
"name": "lion"
},
"name": "Leon",
"photoUrls": [
"123",
"444",
],
"tags": [
{
"id": 1,
"name": "TagLion"
},
{
"id": 2,
"name": "KingOfTheJungle"
}
],
},
{
"id": 83,
"category": {
"id": 0,
"name": "dog UPDATED"
},
"name": "Buff",
"photoUrls": [
"333",
],
"tags": [
{
"id": 1,
"name": "TagNumber1UPDATED"
},
{
"id": 2,
"name": "DogWithStickUPDATED"
}
],
}
]
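From the example above (which is the response from an API), I want to write the output to a CSV. The catch is the "tags" field, which is a nested list. I would like the result above flattened into CSV format like this: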
id | category | name | photoUrls | tags
8 |{'id': 0, 'name': 'dog UPDATED'}| Leon | 123 | {'id': 1, "name": "TagLion"}
83 |{'id': 0, 'name': 'dog UPDATED'}| Buff | 333 | {"id": 1,"name": "TagNumber1UPDATED"}
83 |{'id': 0, 'name': 'dog UPDATED'}| Buff | 333 | {"id": 2,"name": "DogWithStickUPDATED"}
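How can I do this in Python? I would like to set this up as a config so that, when loading into the CSV, Python looks up the config to flatten the "tags" array.
Edit: I would also like to flatten the photoUrls column, which is an array, by joining its values with a pipe rather than splitting them into rows, as shown below.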
id | category | name | photoUrls | tags
8 |{'id': 0, 'name': 'dog UPDATED'}| Leon | 123 |444 | {'id': 1, "name": "TagLion"}
8 |{'id': 0, 'name': 'dog UPDATED'}| Leon | 123 | {'id': 1, "name": "TagLion"}
8 |{'id': 0, 'name': 'dog UPDATED'}| Leon | 123 | {'id': 2, "name": "KingOfTheJungle"}
83 |{'id': 0, 'name': 'dog UPDATED'}| Buff | 333 | {"id": 1,"name": "TagNumber1UPDATED"}
83 |{'id': 0, 'name': 'dog UPDATED'}| Buff | 333 | {"id": 2,"name": "DogWithStickUPDATED"}
Answer 0 (score: 1)
You can use the power of the magical pandas package to expand the tags values into separate rows.
Code:
import pandas as pd
data = [] # your list is here
df = pd.DataFrame(data)
# expand 'tags' column into multiple rows
tags = df.apply(lambda x: pd.Series(x['tags']), axis=1).stack().reset_index(level=1, drop=True)
tags.name = 'tags'
df = df.drop('tags', axis=1).join(tags)
print(df)
Prints:
category id name photoUrls tags
0 {'id': 0, 'name': 'lion'} 8 Leon [123] {'id': 1, 'name': 'TagLion'}
0 {'id': 0, 'name': 'lion'} 8 Leon [123] {'id': 2, 'name': 'KingOfTheJungle'}
1 {'id': 0, 'name': 'dog UPDATED'} 83 Buff [333] {'id': 1, 'name': 'TagNumber1UPDATED'}
1 {'id': 0, 'name': 'dog UPDATED'} 83 Buff [333] {'id': 2, 'name': 'DogWithStickUPDATED'}
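On pandas 0.25 or later, DataFrame.explode should give the same row expansion in a single call. The following is a minimal sketch rather than the answer's own method; the sample records are trimmed from the question, and the name flat is only illustrative.

```python
import pandas as pd

# Trimmed copy of the question's records (the category field is left out for brevity).
data = [
    {"id": 8, "name": "Leon", "photoUrls": ["123", "444"],
     "tags": [{"id": 1, "name": "TagLion"},
              {"id": 2, "name": "KingOfTheJungle"}]},
    {"id": 83, "name": "Buff", "photoUrls": ["333"],
     "tags": [{"id": 1, "name": "TagNumber1UPDATED"},
              {"id": 2, "name": "DogWithStickUPDATED"}]},
]

df = pd.DataFrame(data)

# One output row per element of each 'tags' list; the other columns are repeated.
flat = df.explode("tags")

print(flat[["id", "name", "tags"]])
```

Like the apply/stack version, explode keeps the original index, so each source record's label is repeated once per tag.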
To dump the result to CSV, you can use the .to_csv() method.
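As a rough sketch of that last step, the snippet below also pipe-joins photoUrls, as the question's edit asks, before producing the CSV text. The pipe separator, the trimmed sample records, and the use of DataFrame.explode (pandas 0.25+) in place of the apply/stack code are assumptions made for this illustration.

```python
import pandas as pd

# Trimmed copy of the question's records (the category field is left out for brevity).
data = [
    {"id": 8, "name": "Leon", "photoUrls": ["123", "444"],
     "tags": [{"id": 1, "name": "TagLion"},
              {"id": 2, "name": "KingOfTheJungle"}]},
    {"id": 83, "name": "Buff", "photoUrls": ["333"],
     "tags": [{"id": 1, "name": "TagNumber1UPDATED"},
              {"id": 2, "name": "DogWithStickUPDATED"}]},
]

df = pd.DataFrame(data)

# Collapse each photoUrls list into a single pipe-separated string.
df["photoUrls"] = df["photoUrls"].apply(lambda urls: "|".join(urls))

# Expand 'tags' into one row per tag.
df = df.explode("tags")

# index=False leaves the repeated row labels out of the file.
# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Passing a file name as the first argument, for example df.to_csv("results.csv", index=False), writes the same text to disk.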
You can also extract the "expand column" logic into a separate function and reuse it:
def expand_column(df, column_name):
    # turn each list element of the column into its own row
    c = df.apply(lambda x: pd.Series(x[column_name]), axis=1).stack().reset_index(level=1, drop=True)
    c.name = column_name
    return df.drop(column_name, axis=1).join(c)
Usage:
df = pd.DataFrame(data)
df = expand_column(df, 'tags')
Answer 1 (score: 1)
You can use a nested list comprehension:
import csv
d = [{'id': 8, 'category': {'id': 0, 'name': 'lion'}, 'name': 'Leon', 'photoUrls': ['123'], 'tags': [{'id': 1, 'name': 'TagLion'}, {'id': 2, 'name': 'KingOfTheJungle'}]}, {'id': 83, 'category': {'id': 0, 'name': 'dog UPDATED'}, 'name': 'Buff', 'photoUrls': ['333'], 'tags': [{'id': 1, 'name': 'TagNumber1UPDATED'}, {'id': 2, 'name': 'DogWithStickUPDATED'}]}]
# join photoUrls with a pipe so a record with several URLs still fills exactly one column
new_d = [[i['id'], i['category'], i['name'], '|'.join(i['photoUrls']), c] for i in d for c in i['tags']]
# newline='' is what the csv module recommends to avoid blank lines on Windows
with open('results.csv', 'w', newline='') as f:
    write = csv.writer(f)
    write.writerows([['id', 'category', 'name', 'photoUrls', 'tags'], *new_d])
Output:
id,category,name,photoUrls,tags
8,"{'id': 0, 'name': 'lion'}",Leon,123,"{'id': 1, 'name': 'TagLion'}"
8,"{'id': 0, 'name': 'lion'}",Leon,123,"{'id': 2, 'name': 'KingOfTheJungle'}"
83,"{'id': 0, 'name': 'dog UPDATED'}",Buff,333,"{'id': 1, 'name': 'TagNumber1UPDATED'}"
83,"{'id': 0, 'name': 'dog UPDATED'}",Buff,333,"{'id': 2, 'name': 'DogWithStickUPDATED'}"
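Neither answer covers the "set it up as a config" part of the question. One possible way to approach that with only the standard library is sketched below. The FLATTEN_CONFIG layout and the flatten_rows helper are hypothetical names invented for this example and are not part of any library.

```python
import csv
import io
from itertools import product

# Hypothetical config: how each list-valued column should be flattened.
FLATTEN_CONFIG = {
    "explode": ["tags"],         # one output row per list element
    "join": {"photoUrls": "|"},  # collapse the list into one delimited cell
}

def flatten_rows(records, config):
    """Yield flat dicts, joining or exploding list columns as the config says."""
    explode_cols = config["explode"]
    for rec in records:
        row = dict(rec)
        for col, sep in config["join"].items():
            row[col] = sep.join(str(v) for v in row.get(col, []))
        # A missing or empty list still yields one row, with None in that column.
        choices = [row.get(col) or [None] for col in explode_cols]
        for combo in product(*choices):
            yield {**row, **dict(zip(explode_cols, combo))}

records = [
    {"id": 8, "category": {"id": 0, "name": "lion"}, "name": "Leon",
     "photoUrls": ["123", "444"],
     "tags": [{"id": 1, "name": "TagLion"},
              {"id": 2, "name": "KingOfTheJungle"}]},
    {"id": 83, "category": {"id": 0, "name": "dog UPDATED"}, "name": "Buff",
     "photoUrls": ["333"],
     "tags": [{"id": 1, "name": "TagNumber1UPDATED"},
              {"id": 2, "name": "DogWithStickUPDATED"}]},
]

rows = list(flatten_rows(records, FLATTEN_CONFIG))

# Written to an in-memory buffer here; pass an open file object to write to disk.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(records[0]))
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

Adding another column name to "explode" would produce the cross product of the exploded lists, which may or may not be what you want for photoUrls; that is why the sketch joins it instead.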