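Convert a pandas pivot table to defaultdict(list)

Time: 2021-05-20 10:27:18

Tags: python pandas defaultdict

I am working with a pandas pivot table and need to convert its data into a defaultdict(list).

Here is the pivot table data as a dictionary (from df.to_dict()):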

{
    'count': {
        ('formation1', 'position1'): 2,
        ('formation1', 'position2'): 1
    },
    'personData.employeeContract.edges': {
        ('formation1', 'position1'): 1,
        ('formation1', 'position2'): 0
    },
    'total_count': {
        ('formation1', 'position1'): 2,
        ('formation1', 'position2'): 1
    },
    'count_with_contract': {
        ('formation1', 'position1'): 1,
        ('formation1', 'position2'): 0
    },
    'percent': {
        ('formation1', 'position1'): 0.5,
        ('formation1', 'position2'): 0.0
    }
}

I need to transform the data above into the following:

{
    'formation1': [
        {
            'position1': {
                'total_count': 2.0,
                'count_with_contract': 1.0,
                'percent': 0.5
            }
        },
        {
            'position2': {
                'total_count': 1.0,
                'count_with_contract': 0.0,
                'percent': 0.0
            }
        }
    ]
}

How can I do this?

1 answer:

Answer 0 (score: 1)

First select the columns you need, then build the nested dictionary in a dict comprehension:

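# Keep only the columns that should appear in the result.
df1 = df[['total_count', 'count_with_contract', 'percent']]

# Group by the first index level (formation), drop that level,
# and let to_dict('index') key each remaining row by its position.
d = {i: [g.reset_index(level=0, drop=True).to_dict('index')] for i, g in df1.groupby(level=0)}
print(d)
{'formation1': [{'position1': {'total_count': 2, 'count_with_contract': 1, 'percent': 0.5},
                 'position2': {'total_count': 1, 'count_with_contract': 0, 'percent': 0.0}}]}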
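For anyone who wants to run the dict comprehension on its own, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the values in the question, and the index level names (`formation`, `position`) are my assumption, since the question does not show them. `droplevel(0)` is used in place of `reset_index(level=0, drop=True)`; for this purpose the two should behave the same.

```python
import pandas as pd

# Hypothetical reconstruction of the pivot table from the question.
# The level names are assumed; the original frame may use different ones.
index = pd.MultiIndex.from_tuples(
    [("formation1", "position1"), ("formation1", "position2")],
    names=["formation", "position"],
)
df = pd.DataFrame(
    {
        "total_count": [2, 1],
        "count_with_contract": [1, 0],
        "percent": [0.5, 0.0],
    },
    index=index,
)

# One list per formation, holding a single dict keyed by position.
d = {
    formation: [group.droplevel(0).to_dict("index")]
    for formation, group in df.groupby(level=0)
}
print(d)
```

Each formation maps to a one-element list here, so the positions sit together in a single dictionary rather than in separate list entries.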

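Solution with defaultdict (the output is slightly different: each position becomes its own single-key dictionary in the list):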

from collections import defaultdict

df1 = df[['total_count', 'count_with_contract', 'percent']]
print(df1)

d = defaultdict(list)
# After transposing, each column of df1.T is one (formation, position) row of df1.
for (f, pos), x in df1.T.items():
    d[f].append({pos: x.to_dict()})

print(d)
defaultdict(<class 'list'>,
{'formation1': [{'position1': {'total_count': 2.0, 'count_with_contract': 1.0, 'percent': 0.5}},
                {'position2': {'total_count': 1.0, 'count_with_contract': 0.0, 'percent': 0.0}}]})
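The counts print as 2.0 and 1.0 above because transposing a frame with integer and float columns converts everything to a common float dtype. If that matters, one possible variation is to iterate over `to_dict('index')` instead of the transposed frame. The sketch below is self-contained; the DataFrame is rebuilt by hand from the question's values, and the index level names are my assumption.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical reconstruction of the pivot table from the question.
# The level names are assumed; the original frame may use different ones.
index = pd.MultiIndex.from_tuples(
    [("formation1", "position1"), ("formation1", "position2")],
    names=["formation", "position"],
)
df = pd.DataFrame(
    {
        "total_count": [2, 1],
        "count_with_contract": [1, 0],
        "percent": [0.5, 0.0],
    },
    index=index,
)

d = defaultdict(list)
# With a two-level MultiIndex, to_dict("index") keys each row by a
# (formation, position) tuple, so no transpose is needed.
for (formation, position), row in df.to_dict("index").items():
    d[formation].append({position: row})

print(dict(d))
```

This yields the same list-of-dictionaries shape as the loop over `df1.T.items()`, just without going through the transposed frame.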