How to create a multi-level nested dictionary from a pandas DataFrame?

Asked: 2019-11-25 10:31:06

Tags: python pandas dataframe

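I am trying to create a multi-level nested dictionary from a pandas DataFrame. In the example below, I want to retrieve, for each postal code, the sum of salaries for each combination of sex and age. The output must be the dictionary shown under the "Expected output" comment.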

from typing import NamedTuple, Sequence, Tuple

import pandas as pd

data = [
    ["tom", 22, "ab 11", "M", 5555],
    ["Rob", 22, "ab 11", "M", 9999],
    ["nick", 33, "ab 22", "M", 3333],
    ["juli", 18, "ab 11", "F", 2222],
]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode", "Sex", "Salary"])

d = (
    people.groupby(["PostalCode", "Sex", "Age"])["Salary"]
    .apply(sum)
    .to_dict()
)

print(d)

# Expected output
print({"ab 11": {("M", 22): 15554, ("F", 18): 2222}, "ab 22": {("M", 33): 3333}})

1 answer:

Answer 0 (score: 2)

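Just change your solution slightly and add a dictionary comprehension. Unstacking PostalCode into columns leaves NaN for every (Sex, Age) combination that does not occur in a given postal code, which is why the sums come out as floats (2222.0 rather than 2222).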

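df = (
    people.groupby(["PostalCode", "Sex", "Age"])["Salary"]
    .sum()
    .unstack(0)  # move PostalCode (index level 0) into the columns
)

# One inner dict per postal code; dropna() removes the (Sex, Age) pairs absent from that code
d = {col: df[col].dropna().to_dict() for col in df}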

print(d)

Out[40]:
{'ab 11': {('F', 18): 2222.0, ('M', 22): 15554.0},
 'ab 22': {('M', 33): 3333.0}}
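
If the sums need to stay integers, as in the expected output of the question, one possible variant is to skip unstack and build the nested dictionary directly from the grouped Series. This is only a sketch and not part of the original answer; the names totals and nested are illustrative.

```python
import pandas as pd

data = [
    ["tom", 22, "ab 11", "M", 5555],
    ["Rob", 22, "ab 11", "M", 9999],
    ["nick", 33, "ab 22", "M", 3333],
    ["juli", 18, "ab 11", "F", 2222],
]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode", "Sex", "Salary"])

# Series indexed by (PostalCode, Sex, Age); no NaN is introduced, so the dtype stays integer
totals = people.groupby(["PostalCode", "Sex", "Age"])["Salary"].sum()

# Split each 3-level key into an outer key (postal code) and an inner key (sex, age)
nested = {}
for (postal_code, sex, age), total in totals.items():
    # int() converts the NumPy integer to a plain Python int
    nested.setdefault(postal_code, {})[(sex, age)] = int(total)

print(nested)
```

Because no reshaping takes place, there is nothing to drop afterwards, and the values compare equal to the dictionary in the question.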