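I have the following list of dictionaries that I want to flatten using Python. The data originally comes from Xero.
Here is sample data that I extracted through the API: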
my_dict = [{'RowType': 'Section', 'Title': 'Income', 'Rows': []},{'RowType': 'Section', 'Title': 'Income from Rents', 'Rows': []},
{'RowType': 'Section',
'Title': 'Rent Received',
'Rows': [{'RowType': 'Row',
'Cells': [{'Value': 'Contract Rent',
'Attributes': [{'Value': '5',
'Id': 'account'},
{'Value': '5', 'Id': 'groupID'}]},
{'Value': '721093.92',
'Attributes': [{'Value': '5',
'Id': 'account'},
{'Value': '5', 'Id': 'groupID'}]}]},
{'RowType': 'Row',
'Cells': [{'Value': 'Rent - Carparks',
'Attributes': [{'Value': '95',
'Id': 'account'}]},
{'Value': '3523.33',
'Attributes': [{'Value': '95',
'Id': 'account'}]}]},
{'RowType': 'Row',
'Cells': [{'Value': 'Vacant Tenancies',
'Attributes': [{'Value': '53',
'Id': 'account'}]},
{'Value': '-22226.50',
'Attributes': [{'Value': '53',
'Id': 'account'}]}]},
{'RowType': 'SummaryRow',
'Cells': [{'Value': 'Total Rent Received'}, {'Value': '702390.75'}]}]},
{'RowType': 'Section',
'Title': 'Rent Reductions',
'Rows': [{'RowType': 'Row',
'Cells': [{'Value': 'COVID-19 Rent reduction',
'Attributes': [{'Value': '40',
'Id': 'account'}]},
{'Value': '-132478.03',
'Attributes': [{'Value': '40',
'Id': 'account'}]}]},
{'RowType': 'Row',
'Cells': [{'Value': 'Rent Holiday',
'Attributes': [{'Value': '4d',
'Id': 'account'}]},
{'Value': '-14451.58',
'Attributes': [{'Value': '4d',
'Id': 'account'}]}]},
{'RowType': 'SummaryRow',
'Cells': [{'Value': 'Total Rent Reductions'}, {'Value': '-146929.61'}]}]}]
The desired output is as follows:
   Name                     Amount      Hierarchy_level_3  Hierarchy_level_1  Hierarchy_level_2
0  Contract Rent            721093.92   Rent Received      Income             Income from Rents
1  Rent - Carparks          3523.33     Rent Received      Income             Income from Rents
2  Vacant Tenancies         -22226.50   Rent Received      Income             Income from Rents
3  Total Rent Received      702390.75
4  COVID-19 Rent reduction  -132478.03  Rent Reduction     Income             Income from Rents
.  .                        .           .                  .                  .
.  .                        .           .                  .                  .
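Can someone help me achieve this? The sample data above is in the format I get from the API. I am not sure how to flatten it, and I am fairly new to Python.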
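Answer 0 (score: 2)
Assuming that in your example the Hierarchy_level_3 of row 4 is Rent Received rather than Rent Reduction, and that your example has a 4th hierarchy level, here is a solution. I added a level number and a level name because I think they may be more useful than the "hierarchy level" columns, but feel free to remove them.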
import pandas as pd

# Map each section title to a level name based on its position in the list
hierarchy = {f'Hierarchy_level_{i+1}': d['Title'] for i, d in enumerate(my_dict)}
all_data = []
for level, d in enumerate(my_dict):
    for row in d['Rows']:
        cells = row['Cells']  # first cell holds the name, second the amount
        all_data.append({
            'Name': cells[0]['Value'],
            'Amount': cells[1]['Value'],
            'Level': level,
            'Level_name': hierarchy[f'Hierarchy_level_{level+1}'],
            **hierarchy  # every row gets all four level columns
        })
df = pd.DataFrame(all_data)
Output:
   Name                     Amount      Level  Level_name       Hierarchy_level_1  Hierarchy_level_2  Hierarchy_level_3  Hierarchy_level_4
0  Contract Rent            721093.92   2      Rent Received    Income             Income from Rents  Rent Received      Rent Reductions
1  Rent - Carparks          3523.33     2      Rent Received    Income             Income from Rents  Rent Received      Rent Reductions
2  Vacant Tenancies         -22226.50   2      Rent Received    Income             Income from Rents  Rent Received      Rent Reductions
3  Total Rent Received      702390.75   2      Rent Received    Income             Income from Rents  Rent Received      Rent Reductions
4  COVID-19 Rent reduction  -132478.03  3      Rent Reductions  Income             Income from Rents  Rent Received      Rent Reductions
5  Rent Holiday             -14451.58   3      Rent Reductions  Income             Income from Rents  Rent Received      Rent Reductions
6  Total Rent Reductions    -146929.61  3      Rent Reductions  Income             Income from Rents  Rent Received      Rent Reductions
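To see why every row carries the same four Hierarchy_level columns, it may help to look at the mapping on its own. The following is a minimal sketch that rebuilds it from the four section titles only; the names titles, sections and record are made up for this illustration and do not come from the Xero response.

```python
# Only the section titles matter for building the mapping.
titles = ['Income', 'Income from Rents', 'Rent Received', 'Rent Reductions']
sections = [{'Title': t, 'Rows': []} for t in titles]  # stand-in for my_dict

# Same comprehension as in the answer: position in the list -> level name
hierarchy = {f'Hierarchy_level_{i+1}': d['Title'] for i, d in enumerate(sections)}

# **hierarchy copies all four key/value pairs into the row dictionary,
# regardless of which section the row actually belongs to.
record = {'Name': 'Contract Rent', **hierarchy}
print(record)
```

Because the mapping is keyed by list position, it treats each section as its own level, which is where the 4th level in the output above comes from.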
EDIT: Since only 3 hierarchy levels are needed:
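import pandas as pd

# Map each section title to a level name based on its position in the list
hierarchy = {f'Hierarchy_level_{i+1}': d['Title'] for i, d in enumerate(my_dict)}
all_data = []
for level, d in enumerate(my_dict):
    for row in d['Rows']:
        cells = row['Cells']  # first cell holds the name, second the amount
        all_data.append({
            'Name': cells[0]['Value'],
            'Amount': cells[1]['Value'],
            # The first two sections are fixed parents of every row
            'Hierarchy_level_1': hierarchy['Hierarchy_level_1'],
            'Hierarchy_level_2': hierarchy['Hierarchy_level_2'],
            # The third level is the title of the section that owns the row
            'Hierarchy_level_3': hierarchy[f'Hierarchy_level_{level+1}'],
        })
df = pd.DataFrame(all_data)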
Output:
   Name                     Amount      Hierarchy_level_1  Hierarchy_level_2  Hierarchy_level_3
0  Contract Rent            721093.92   Income             Income from Rents  Rent Received
1  Rent - Carparks          3523.33     Income             Income from Rents  Rent Received
2  Vacant Tenancies         -22226.50   Income             Income from Rents  Rent Received
3  Total Rent Received      702390.75   Income             Income from Rents  Rent Received
4  COVID-19 Rent reduction  -132478.03  Income             Income from Rents  Rent Reductions
5  Rent Holiday             -14451.58   Income             Income from Rents  Rent Reductions
6  Total Rent Reductions    -146929.61  Income             Income from Rents  Rent Reductions
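The desired output in the question leaves the hierarchy columns blank for the summary row, and the amounts arrive as strings. Below is a rough sketch of one way to cover both points without hard-coding the level names. It assumes that sections with an empty Rows list are parent headings of the sections that follow, which matches the sample but may not hold for every report. The helper flatten_report and the trimmed sample list are hypothetical names for this illustration, not part of any Xero SDK.

```python
# Trimmed copy of the question's data so the sketch runs on its own.
sample = [
    {'RowType': 'Section', 'Title': 'Income', 'Rows': []},
    {'RowType': 'Section', 'Title': 'Income from Rents', 'Rows': []},
    {'RowType': 'Section', 'Title': 'Rent Received', 'Rows': [
        {'RowType': 'Row',
         'Cells': [{'Value': 'Contract Rent'}, {'Value': '721093.92'}]},
        {'RowType': 'SummaryRow',
         'Cells': [{'Value': 'Total Rent Received'}, {'Value': '702390.75'}]},
    ]},
    {'RowType': 'Section', 'Title': 'Rent Reductions', 'Rows': [
        {'RowType': 'Row',
         'Cells': [{'Value': 'Rent Holiday'}, {'Value': '-14451.58'}]},
        {'RowType': 'SummaryRow',
         'Cells': [{'Value': 'Total Rent Reductions'}, {'Value': '-146929.61'}]},
    ]},
]


def flatten_report(sections):
    """Return one flat dict per row (hypothetical helper for this sketch)."""
    parents = []  # titles of sections without rows, assumed to be parent headings
    records = []
    for section in sections:
        if not section['Rows']:
            parents.append(section['Title'])
            continue
        titles = parents + [section['Title']]
        for row in section['Rows']:
            name, amount = (cell['Value'] for cell in row['Cells'][:2])
            record = {'Name': name, 'Amount': float(amount)}
            is_summary = row['RowType'] == 'SummaryRow'
            for i, title in enumerate(titles, start=1):
                # Summary rows keep the columns but leave them blank
                record[f'Hierarchy_level_{i}'] = '' if is_summary else title
            records.append(record)
    return records


records = flatten_report(sample)
for record in records:
    print(record)
```

Passing the result to pd.DataFrame(records), or calling flatten_report(my_dict) on the full data, should give a frame with the same columns as the output above.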