DataFrame:
Name class section
A 5 c
B 3 a
C 4 b
Dictionary:
dict = {'A': ['Singing', 'dancing', 'Drawing'], 'C': ['Gamming', 'Painting'], 'D': ['Football', 'Basketball']}
Expected output:
Name  Class  Section  Hobby
A     5      c        Singing
                      dancing
                      Drawing
B     3      a
C     4      b        Gaming
                      Painting
D                     Football
                      Basketball
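I need to combine the DataFrame and the dictionary by matching on Name. In some cases the DataFrame will not contain a name that appears in the dictionary; in that case I need to add an extra row to the DataFrame. I need to do this in Python.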
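Answer 0 (score: 1)
First, don't use dict as the name of the dictionary variable, because it shadows the Python builtin. To add the dictionary keys that are missing from the DataFrame, use Index.union with DataFrame.reindex; then create the new column with Series.map and expand it to one row per hobby with DataFrame.explode (the dictionary is named d here):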
df = df.set_index('Name')
df = df.reindex(df.index.union(d.keys(), sort=False)).rename_axis('Name').reset_index()
df['Hobby'] = df['Name'].map(d)
df = df.explode('Hobby')
print (df)
Name class section Hobby
0 A 5.0 c Singing
0 A 5.0 c dancing
0 A 5.0 c Drawing
1 B 3.0 a NaN
2 C 4.0 b Gamming
2 C 4.0 b Painting
3 D NaN NaN Football
3 D NaN NaN Basketball
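For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The sample frame and dictionary are rebuilt from the question; the names `out` and `all_names` are just illustrative, and the dictionary keys are wrapped in a list before the union as a cautious choice rather than a requirement.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "class": [5, 3, 4],
    "section": ["c", "a", "b"],
})
# Named `d` rather than `dict` so the builtin is not shadowed.
d = {
    "A": ["Singing", "dancing", "Drawing"],
    "C": ["Gamming", "Painting"],
    "D": ["Football", "Basketball"],
}

# Add a row for every dictionary key that is missing from the frame.
out = df.set_index("Name")
all_names = out.index.union(list(d), sort=False)
out = out.reindex(all_names).rename_axis("Name").reset_index()

# Attach each name's list of hobbies, then expand to one hobby per row.
# Names without an entry in `d` (here "B") keep a single row with NaN.
out["Hobby"] = out["Name"].map(d)
out = out.explode("Hobby")
print(out)
```

Note that the row added for D has no class or section, so reindex fills those with NaN and the class column becomes float.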
Finally, if needed, set the repeated values to empty strings and replace the NaNs:
df.loc[df.index.duplicated(), ['Name','class','section']] = ''
df = df.fillna('').reset_index(drop=True)
print (df)
   Name  class  section  Hobby
0  A     5      c        Singing
1                        dancing
2                        Drawing
3  B     3      a
4  C     4      b        Gamming
5                        Painting
6  D                     Football
7                        Basketball
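One caveat with the blanking step: because class became float after the reindex, it may still print as 5.0, and writing '' into a float column can trigger dtype warnings in recent pandas versions. A possible variation, assuming the class values are always whole numbers, is to cast them to the nullable Int64 dtype and then to object before blanking. The exploded frame is rebuilt by hand here so the sketch runs on its own; `out` and `key_cols` are illustrative names.

```python
import pandas as pd

# Exploded frame from the previous step, rebuilt by hand for this sketch.
out = pd.DataFrame(
    {
        "Name": ["A", "A", "A", "B", "C", "C", "D", "D"],
        "class": [5.0, 5.0, 5.0, 3.0, 4.0, 4.0, float("nan"), float("nan")],
        "section": ["c", "c", "c", "a", "b", "b", None, None],
        "Hobby": ["Singing", "dancing", "Drawing", None,
                  "Gamming", "Painting", "Football", "Basketball"],
    },
    index=[0, 0, 0, 1, 2, 2, 3, 3],
)

# Assumption: class holds whole numbers only, so nullable Int64 can store them.
out["class"] = out["class"].astype("Int64")

# Cast everything to object so empty strings can sit next to the other values.
out = out.astype(object)

# Blank the key columns on every repeated index label except the first.
key_cols = ["Name", "class", "section"]
out.loc[out.index.duplicated(), key_cols] = ""

# Replace the remaining missing values and renumber the rows.
out = out.fillna("").reset_index(drop=True)
print(out)
```

The result is meant for display only: once the columns hold a mix of numbers and empty strings, numeric operations on class no longer work directly.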