How can I compute the weighted average of two pandas DataFrames only where both cells are available?

Time: 2020-11-09 15:57:02

Tags: python pandas dataframe

I have two DataFrames like this:

import pandas as pd

df1 = pd.DataFrame({
    "a": [0.5,0.5,0.5],
    "b": [0.5,0.5,0.5],
    "c": [0.5,0.5,0.5]
}, index = ["a", "b", "c"])
df1


df2 = pd.DataFrame({
    "a": [0.3,0.3],
    "b": [0.3,0.3],
}, index = ["a", "b"])
df2


df2 is always a subset of df1: its index and column labels all appear in df1.

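If df2 also contains a cell of df1 (same index and column), I want to compute the weighted average of the two values. If not, the df1 value should stay unchanged.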

I can do it like this:

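average = pd.DataFrame()

# Iterate over row labels (index) and column labels separately
for r in df1.index:
    for c in df1.columns:
        try:
            # Cell exists in both frames: take the weighted average
            average.loc[r, c] = 0.4 * df1.loc[r, c] + 0.6 * df2.loc[r, c]
        except KeyError:
            # Cell is missing from df2: keep the df1 value
            average.loc[r, c] = df1.loc[r, c]

average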


However, for larger DataFrames this takes a long time. Is there a faster way?

Thanks a lot!

1 answer:

Answer 0: (score: 1)

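Try fillna. DataFrame arithmetic aligns on both index and columns, so any cell that is missing from df2 comes out as NaN, and fillna(df1) then puts the original df1 value back: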

(df1 * 0.4 + df2 * 0.6).fillna(df1)

Output:

      a     b    c
a  0.38  0.38  0.5
b  0.38  0.38  0.5
c  0.50  0.50  0.5
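
For anyone who wants to check this on their own data, here is a minimal, self-contained sketch of the same idea using the frames from the question. The names weighted and expected are only illustrative, and the small loop is there purely as a cross-check against the vectorized result, not as something to run on large frames.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"a": [0.5, 0.5, 0.5], "b": [0.5, 0.5, 0.5], "c": [0.5, 0.5, 0.5]},
    index=["a", "b", "c"],
)
df2 = pd.DataFrame({"a": [0.3, 0.3], "b": [0.3, 0.3]}, index=["a", "b"])

# Arithmetic between DataFrames aligns on index AND column labels.
# Labels that exist only in df1 produce NaN in the result.
weighted = 0.4 * df1 + 0.6 * df2

# Replace those NaN cells with the untouched df1 values.
average = weighted.fillna(df1)

# Cross-check (illustrative only): overwrite just the overlapping cells by hand.
expected = df1.copy()
for r in df2.index:
    for c in df2.columns:
        expected.loc[r, c] = 0.4 * df1.loc[r, c] + 0.6 * df2.loc[r, c]

# Raises AssertionError if the two results differ beyond float tolerance.
pd.testing.assert_frame_equal(average, expected)
print(average)
```

One caveat: fillna cannot tell a cell that is absent from df2 apart from a cell that is present in df2 but holds NaN. In both cases the df1 value is used, which may or may not be what you want.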