What is an efficient way to build a grouped DataFrame from a nested dictionary?
Code snippet
import pandas as pd

# Encoding description
encoding_dict = {"Age": {"Middle": 0,
                         "Senior": 1,
                         "Young": 2},
                 "Sex": {"F": 0,
                         "M": 1},
                 "BP": {"High": 0,
                        "Low": 1,
                        "Normal": 2},
                 "Cholesterol": {"High": 0,
                                 "Normal": 1}}

# Step 1: Create the DataFrame by hand
df_1 = pd.DataFrame({"Features": ["Age"]*3 + ["Sex"]*2 + ["BP"]*3 + ["Cholesterol"]*2,
                     "Categories": ["Middle", "Senior", "Young", "F", "M", "High", "Low", "Normal", "High", "Normal"],
                     "Encoding": [0, 1, 2, 0, 1, 0, 1, 2, 0, 1]})

# Step 2: Group by feature and category
grouped = df_1.groupby(["Features", "Categories"]).sum()
print(grouped)
Output
                        Encoding
Features    Categories
Age         Middle             0
            Senior             1
            Young              2
BP          High               0
            Low                1
            Normal             2
Cholesterol High               0
            Normal             1
Sex         F                  0
            M                  1
What is an efficient way to create the desired grouped DataFrame directly from the nested dictionary, without performing step (1) by hand?
Answer 0 (score: 1)
A dictionary comprehension passed to the DataFrame constructor, followed by naming the index levels, can work:
df = pd.DataFrame(
    {'encoding': {(k, sub_k): v
                  for k, sub_d in encoding_dict.items()
                  for sub_k, v in sub_d.items()}}
).rename_axis(index=['Features', 'Categories'])
df
Output:
                        encoding
Features    Categories
Age         Middle             0
            Senior             1
            Young              2
BP          High               0
            Low                1
            Normal             2
Cholesterol High               0
            Normal             1
Sex         F                  0
            M                  1
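
As a cross-check, here is a minimal sketch of a closely related variant, not the answerer's exact code: it flattens the nested dictionary into tuple keys, builds a Series (which gets a MultiIndex from those keys), and compares the result with the manually built frame from the question. The names `flat`, `from_dict`, and `manual` are made up for illustration. The explicit `sort_index()` is there on the assumption that, depending on the pandas version, the rows may come back in dictionary insertion order rather than the sorted order that `groupby` produces.

```python
import pandas as pd

encoding_dict = {
    "Age": {"Middle": 0, "Senior": 1, "Young": 2},
    "Sex": {"F": 0, "M": 1},
    "BP": {"High": 0, "Low": 1, "Normal": 2},
    "Cholesterol": {"High": 0, "Normal": 1},
}

# Flatten the nested dict into {(feature, category): code}.
flat = {
    (feature, category): code
    for feature, categories in encoding_dict.items()
    for category, code in categories.items()
}

# A Series built from tuple keys gets a two-level MultiIndex.
# Name the levels, turn it into a one-column frame, and sort the
# index so the row order matches what groupby would produce.
from_dict = (
    pd.Series(flat, name="Encoding")
    .rename_axis(["Features", "Categories"])
    .to_frame()
    .sort_index()
)

# Manual construction from the question, kept only for comparison.
manual = (
    pd.DataFrame({
        "Features": ["Age"] * 3 + ["Sex"] * 2 + ["BP"] * 3 + ["Cholesterol"] * 2,
        "Categories": ["Middle", "Senior", "Young", "F", "M",
                       "High", "Low", "Normal", "High", "Normal"],
        "Encoding": [0, 1, 2, 0, 1, 0, 1, 2, 0, 1],
    })
    .groupby(["Features", "Categories"])
    .sum()
)

# Compare index entries and values of the two frames.
print(list(from_dict.index) == list(manual.index))
print(from_dict["Encoding"].tolist() == manual["Encoding"].tolist())
```

If both comparisons print True, the comprehension-based frame can stand in for steps (1) and (2) of the question.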