Question

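I have two data frames.

df1 contains many NaN values that I want to replace with values from df2. The number of values in df2 equals the number of NaNs in df1.

I have tried join, merge, and writing loops, but without success.

Thanks!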

pd.DataFrame 1
0          NaN
1        240.0
2        229.0
3       1084.0
4       2078.0
        ....
Name: Healthcare_1, Length: 9999, dtype: float64

pd.DataFrame 2
0        830.0
6        100.0
7        100.0
8        830.0
9       1046.0
         ...  
Name: Healthcare_1, Length: 4797, dtype: float64

Answer 1

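In my answer, I assume that the rows of DataFrame 1 that contain NaN have the same index as the rows of DataFrame 2 that should replace them.

Load the following modules: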
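If that assumption does not hold but the number of values in df2 still equals the number of NaNs in df1 (as the question states), one possible fallback is to fill by position instead of by index. The sketch below is only an illustration: the data and the names `s1`, `s2`, `mask`, and `filled` are hypothetical and not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: s2's index does not line up with the NaN rows of s1.
s1 = pd.Series([np.nan, 240.0, np.nan, 1084.0, 2078.0], name="Healthcare_1")
s2 = pd.Series([830.0, 100.0], index=[10, 11], name="Healthcare_1")

mask = s1.isna()

# Positional filling only makes sense when the counts agree.
if mask.sum() != len(s2):
    raise ValueError("number of NaNs and number of replacement values differ")

filled = s1.copy()
# .to_numpy() drops s2's index, so the values are assigned in order of appearance.
filled.loc[mask] = s2.to_numpy()
```

Filling by position silently depends on the order of rows in df2, so it is worth using only when the index really carries no meaning.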

import pandas as pd
import numpy as np

Here are two example DataFrames:

df1 = pd.DataFrame({'c1': [np.nan, 240, np.nan, 1084, 2078]})
df2 = pd.DataFrame({'c1': [830, 100, 100, 830, 1046]}, index=[0,2,7,8,9])

Determine the indices in df1 where NaN occurs:

ind = np.where(df1['c1'].isnull())[0]
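As a small illustration of what this step produces, the sketch below compares the positional result of `np.where` with a label-based equivalent. The names `positions` and `labels` are introduced here for illustration only. The two agree in this example because df1 uses the default integer index.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'c1': [np.nan, 240, np.nan, 1084, 2078]})

# With a single argument, np.where returns a tuple holding one array per
# dimension, so [0] extracts the integer positions for this 1-D column.
positions = np.where(df1['c1'].isna())[0]

# Label-based equivalent: index the Index object with the boolean mask.
labels = df1.index[df1['c1'].isna()]
```

With a non-default index the two would differ, and the label-based form is the one that matches `index.isin(...)`.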

Check where these indices appear in df2. This should give array([True, True, False, False, False]):

df2.index.isin(list(ind))

Replace the values in df1 at the indices in ind with the values from df2:

df1[df1.index.isin(ind)] = df2[df2.index.isin(ind)]
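For comparison, under the same matching-index assumption, the steps above can probably be collapsed into a single index-aligned `fillna` call. This is a sketch of an alternative, not the approach this answer uses. The name `result` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'c1': [np.nan, 240, np.nan, 1084, 2078]})
df2 = pd.DataFrame({'c1': [830, 100, 100, 830, 1046]}, index=[0, 2, 7, 8, 9])

result = df1.copy()
# fillna with a Series aligns on index labels: only the NaN cells of df1 are
# filled, and rows of df2 whose labels are absent from df1 (7, 8, 9) are ignored.
result['c1'] = result['c1'].fillna(df2['c1'])
```

Working on a copy keeps the original df1 available for checking which rows were filled.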

Answer 2

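Solution 1: use .update() to replace the NaN values in df1 with the corresponding values from df2 (it works in place and overwrites the value at every index label that df2 shares with df1, not only the NaNs):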

import pandas as pd
import numpy as np

df1 = pd.Series([np.nan, 240, 229, 1084, 2078])
df2 = pd.Series([830, 100, 100, 830, 1046], index=[0, 6, 7, 8, 9])

df1.update(df2)
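One detail worth checking before relying on `.update()`: it returns `None` and changes the object in place, and it replaces existing non-NaN values as well wherever the labels match. The sketch below uses hypothetical data (`s1`, `s2`) in which label 1 exists in both Series, to contrast it with `fillna`, which touches only the NaNs.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([np.nan, 240.0, 229.0])
# Hypothetical replacement values: label 1 is also present (and non-NaN) in s1.
s2 = pd.Series([830.0, 999.0], index=[0, 1])

updated = s1.copy()
ret = updated.update(s2)   # modifies `updated` in place; ret is None

# fillna only fills missing cells, so the existing 240.0 at label 1 survives.
nan_only = s1.fillna(s2)
```

In the example from this answer the only shared label is 0, which is NaN in df1, so both calls would give the same result there.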

Solution 2: you can also use .combine_first() to fill the np.nan values of the first data frame with the values of the second:

# .loc selects by label, keeping only the rows that were originally in df1
df1.combine_first(df2).loc[df1.index]
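To see why the result needs trimming at all, here is a minimal sketch on fresh copies of the example data. `combine_first` returns the union of both indexes, so the extra labels from df2 (6 to 9) appear in the output unless they are dropped. The names `s1`, `s2`, `combined`, and `trimmed` are introduced for illustration, and `reindex` is used here as one label-based way to restrict the result.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([np.nan, 240, 229, 1084, 2078])
s2 = pd.Series([830, 100, 100, 830, 1046], index=[0, 6, 7, 8, 9])

# The result covers every label found in either Series.
combined = s1.combine_first(s2)

# Keep only the labels that were originally in s1.
trimmed = combined.reindex(s1.index)
```

Unlike `.update()`, `combine_first` returns a new object and leaves `s1` untouched.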

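Result (in both cases): the NaN at index 0 becomes 830.0, and the other four values of df1 stay unchanged.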
