Question

I want to merge 2 columns into 1 and drop the NaN values.

I have this data:

     Name       A      A   
    Pikachu   2007    nan
    Pikachu   nan     2008
    Raichu    2007    nan
    Mew       nan     2018

Expected result:

     Name     Year   
    Pikachu   2007   
    Pikachu   2008   
    Raichu    2007   
    Mew       2018

The code I tried:

df['Year']= df['A','A'].astype(str).apply(''.join,1)

Answer 1

You can do it like this (the two columns cannot share the same name, they have to be different; mine is named A.1):

df['year'] = df.A.combine_first(df['A.1'])  # adds a new column 'year'; you still have to drop the two existing columns afterwards

df['year'] = df.pop('A').combine_first(df.pop('A.1'))  # removes the existing columns and creates the new one in a single step
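A minimal sketch of the pop/combine_first variant, assuming the second column has already been renamed to A.1. The frame is rebuilt by hand from the question's data, and the final astype(int) is an addition that only works because no row is missing both values:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question; "A.1" is the renamed duplicate column
df = pd.DataFrame({
    "Name": ["Pikachu", "Pikachu", "Raichu", "Mew"],
    "A": [2007, np.nan, 2007, np.nan],
    "A.1": [np.nan, 2008, np.nan, 2018],
})

# pop() removes each column from df and returns it;
# combine_first() keeps the value from A and falls back to A.1 where A is NaN
df["Year"] = df.pop("A").combine_first(df.pop("A.1")).astype(int)

print(df)
```

After this runs, df holds only the Name and Year columns.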

OR

df.bfill(axis=1)  # fills NaNs in the first column from the second one; returns a new frame

OR

df.ffill(axis=1)  # fills NaNs in the second column from the first one; returns a new frame
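Neither call changes df in place, so the filled column still has to be picked out and assigned. A rough sketch, assuming the same A / A.1 naming as above (the variable names year_cols, from_bfill and from_ffill are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Pikachu", "Pikachu", "Raichu", "Mew"],
    "A": [2007, np.nan, 2007, np.nan],
    "A.1": [np.nan, 2008, np.nan, 2018],
})

# Work only on the two year columns
year_cols = df[["A", "A.1"]]

# bfill pulls values to the left, so the FIRST column ends up complete
from_bfill = year_cols.bfill(axis=1).iloc[:, 0]

# ffill pushes values to the right, so the LAST column ends up complete
from_ffill = year_cols.ffill(axis=1).iloc[:, -1]

df["Year"] = from_bfill.astype(int)
print(df[["Name", "Year"]])
```

Both directions give the same merged column here because every row has exactly one value.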

Answer 2

I would use ffill:

df['Year'] = df.ffill(axis=1).iloc[:, -1]  # recent pandas versions require axis to be passed by keyword
df
      Name       A     A.1  Year
0  Pikachu  2007.0     NaN  2007
1  Pikachu     NaN  2008.0  2008
2   Raichu  2007.0     NaN  2007
3      Mew     NaN  2018.0  2018
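One thing to watch with filling across the whole frame: if some row has no year in either column, the Name string can get carried into the last column. The sketch below adds a made-up fifth row ("Eevee", not in the question's data) to show this, and restricts the fill to the year columns as one possible way around it. The nullable Int64 dtype is my choice so the missing year can stay missing:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Pikachu", "Pikachu", "Raichu", "Mew", "Eevee"],  # "Eevee" is a hypothetical extra row
    "A": [2007, np.nan, 2007, np.nan, np.nan],
    "A.1": [np.nan, 2008, np.nan, 2018, np.nan],
})

# Filling across every column lets the Name value flow into empty year cells
whole = df.ffill(axis=1).iloc[:, -1]
print(whole.tolist())

# Filling only the year columns keeps a missing year missing
safe = df[["A", "A.1"]].ffill(axis=1).iloc[:, -1]
df["Year"] = safe.astype("Int64")  # nullable integer dtype, so <NA> is allowed
print(df)
```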

Answer 3

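I recommend ffill; this is just another approach. If 'nan' is a real NaN (i.e. not a string) and the other values are floats, you can use sum after slicing the dataframe with [['A']]. That slice selects every column named A and returns them as one dataframe: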

print(df[['A']])

        A       A
0  2007.0     NaN
1     NaN  2008.0
2  2007.0     NaN
3     NaN  2018.0

Run sum on this slice:

df[['A']].sum(1).astype(int)

Out[62]:
0    2007
1    2008
2    2007
3    2018
dtype: int32
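The sum trick relies on each row having exactly one value: two values in a row would be added together, and a row with none sums to 0. A small sketch of guarding the second case with min_count=1, using a made-up "Eevee" row with no year at all (the row, the temporary column name B and the Int64 dtype are assumptions for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Pikachu", "Pikachu", "Raichu", "Mew", "Eevee"],  # "Eevee" is a hypothetical extra row
    "A": [2007, np.nan, 2007, np.nan, np.nan],
    "B": [np.nan, 2008, np.nan, 2018, np.nan],
})
df.columns = ["Name", "A", "A"]  # recreate the duplicate column names from the question

# Plain sum skips NaN, so a row with no values at all becomes 0.0
plain = df[["A"]].sum(axis=1)

# min_count=1 requires at least one real value, otherwise the result is NaN
guarded = df[["A"]].sum(axis=1, min_count=1)

# Nullable integer dtype keeps the missing year as <NA> instead of failing
years = guarded.astype("Int64")
print(years)
```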

Create a new dataframe:

df_new = df[['Name']].assign(Year=df[['A']].sum(1).astype(int))

Out[67]:
      Name  Year
0  Pikachu  2007
1  Pikachu  2008
2   Raichu  2007
3      Mew  2018

If all the A columns hold strings, use pd.to_numeric to convert them to a numeric type first:

df[['A']].apply(pd.to_numeric, errors='coerce').sum(1).astype(int)

Out[97]:
0    2007
1    2008
2    2007
3    2018
dtype: int32
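Put together as a self-contained sketch, assuming both A columns arrive as strings with the literal text 'nan' for missing entries (the frame is built by hand here, with a temporary column name B before the duplicate names are restored):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Pikachu", "Pikachu", "Raichu", "Mew"],
    "A": ["2007", "nan", "2007", "nan"],
    "B": ["nan", "2008", "nan", "2018"],
})
df.columns = ["Name", "A", "A"]  # recreate the duplicate column names from the question

# Convert every column named A to numbers; text that cannot be parsed becomes NaN
numeric = df[["A"]].apply(pd.to_numeric, errors="coerce")

# Sum across each row (NaN is skipped) and attach the result next to Name
df_new = df[["Name"]].assign(Year=numeric.sum(axis=1).astype(int))
print(df_new)
```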

Merge 2 columns with the same name into 1 column
