I have this data:
import numpy as np
import pandas as pd
group = {
    'gender': ['male', 'female', 'female', 'male', 'female', 'male', 'male'],
    'height': [175, 168, np.nan, 170, 167, np.nan, 190],
}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
df = pd.DataFrame(group, index=labels)
df2 = df.groupby('gender')['height'].mean()
I want to fill the NaN values in height with the per-gender means stored in df2.
Answer 0 (score: 2)
You can use groupby and transform with mean, which gives a Series of group means aligned to the original rows, and then pass that Series to fillna.
means = df.groupby('gender')['height'].transform('mean')
df['height'] = df['height'].fillna(means)
print(df)
   gender      height
a    male  175.000000
b  female  168.000000
c  female  167.500000
d    male  170.000000
e  female  167.000000
f    male  178.333333
g    male  190.000000
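As a rough illustration of why transform is the piece that makes this work, the sketch below compares the result of mean (one value per group) with the result of transform('mean') (one value per original row). The variable names group_means, row_means and filled are only illustrative and are not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'gender': ['male', 'female', 'female', 'male', 'female', 'male', 'male'],
        'height': [175, 168, np.nan, 170, 167, np.nan, 190],
    },
    index=['a', 'b', 'c', 'd', 'e', 'f', 'g'],
)

# mean() collapses each group to a single value, indexed by gender.
group_means = df.groupby('gender')['height'].mean()

# transform('mean') repeats the group mean on every row of that group,
# so the result shares df's index and lines up with df['height'].
row_means = df.groupby('gender')['height'].transform('mean')

# fillna aligns on the index and only replaces the missing entries.
filled = df['height'].fillna(row_means)

print(group_means)
print(row_means)
print(filled)
```

Because row_means has the same index as df, fillna can match each missing height with the mean of its own gender without any extra lookup.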
Answer 1 (score: 2)
Code
import pandas as pd
import numpy as np
group = {
    'gender': ['male', 'female', 'female', 'male', 'female', 'male', 'male'],
    'height': [175, 168, np.nan, 170, 167, np.nan, 190],
}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
df = pd.DataFrame(group, index=labels)
df2 = df.groupby('gender')['height'].mean()
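# Map each row's gender to its group mean and assign the filled column back.
# (Calling fillna(..., inplace=True) on df['height'] triggers a chained-assignment
# warning in recent pandas versions and may leave df unchanged.)
df['height'] = df['height'].fillna(df['gender'].map(df2))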
# print(df2)
print(df)
Output
   gender      height
a    male  175.000000
b  female  168.000000
c  female  167.500000
d    male  170.000000
e  female  167.000000
f    male  178.333333
g    male  190.000000
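For comparison, here is a small sketch that runs both approaches from this thread on the same data and checks that they agree. The names via_transform, via_map and same are assumptions made for this example only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'gender': ['male', 'female', 'female', 'male', 'female', 'male', 'male'],
        'height': [175, 168, np.nan, 170, 167, np.nan, 190],
    },
    index=['a', 'b', 'c', 'd', 'e', 'f', 'g'],
)

# Approach 1: broadcast the group means to every row with transform.
via_transform = df['height'].fillna(
    df.groupby('gender')['height'].transform('mean')
)

# Approach 2: compute one mean per gender, then look up each row's gender.
group_means = df.groupby('gender')['height'].mean()
via_map = df['height'].fillna(df['gender'].map(group_means))

# Both results should hold the same values on the same index.
same = bool(np.allclose(via_transform, via_map))
print(same)
```

Both versions leave df itself untouched and return a new Series, so the result still has to be assigned back to df['height']. Note also that if every height in a group were missing, that group's mean would itself be NaN and those rows would stay unfilled under either approach.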