Merge 2 columns in a pandas DataFrame, filling NaN with the previous value

Time: 2017-05-17 11:36:17

Tags: python pandas dataframe

I have a DataFrame:

         State                           RegionName
0      Alabama                              Alabama
1          NaN                               Auburn
2          NaN                             Florence
3          NaN                         Jacksonville
4          NaN                           Livingston
5          NaN                           Montevallo
6          NaN                                 Troy
7          NaN                           Tuscaloosa
8          NaN                             Tuskegee
9       Alaska                               Alaska
10         NaN                            Fairbanks
11     Arizona                              Arizona
12         NaN                            Flagstaff
13         NaN                                Tempe
14         NaN                               Tucson

How can I return

DataFrame([["Alabama", "Auburn"],
           ["Alabama", "Florence"],
           ...,
           ["Alaska", "Fairbanks"],
           ["Arizona", "Flagstaff"],
           ...], columns=["State", "RegionName"])

so that all the values are merged cleanly?

I tried df['State'] = df['State'].apply(lambda x: df['RegionName']), but it has no logic for assigning a new state when a new state name starts in RegionName.

2 answers:

Answer 0: (score: 1)

You need ffill, which propagates the last non-NaN value forward:

df['State'] = df['State'].ffill()
print(df)
      State    RegionName
0   Alabama       Alabama
1   Alabama        Auburn
2   Alabama      Florence
3   Alabama  Jacksonville
4   Alabama    Livingston
5   Alabama    Montevallo
6   Alabama          Troy
7   Alabama    Tuscaloosa
8   Alabama      Tuskegee
9    Alaska        Alaska
10   Alaska     Fairbanks
11  Arizona       Arizona
12  Arizona     Flagstaff
13  Arizona         Tempe
14  Arizona        Tucson
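
The output above still contains the state header rows (for example Alabama / Alabama), which the question's desired result leaves out. A minimal sketch of one way to drop them, assuming a header row is exactly a row where State was non-NaN before filling; the shortened sample data and the names is_header and result are illustrative, not from the original post:

```python
import numpy as np
import pandas as pd

# Shortened sample with the same layout as the question:
# a state "header" row followed by that state's regions.
df = pd.DataFrame({
    "State": ["Alabama", np.nan, np.nan, "Alaska", np.nan, "Arizona", np.nan],
    "RegionName": ["Alabama", "Auburn", "Florence",
                   "Alaska", "Fairbanks", "Arizona", "Flagstaff"],
})

# Record which rows are state headers BEFORE filling,
# because after ffill every row has a State value.
is_header = df["State"].notna()

# Propagate each state name down to its region rows.
df["State"] = df["State"].ffill()

# Keep only the region rows and renumber the index.
result = df[~is_header].reset_index(drop=True)
print(result)
```

Comparing the two columns after filling (df["State"] != df["RegionName"]) would also work on this sample, but it would wrongly drop a region whose name happens to match its state, so the mask is taken from the NaN pattern instead.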

Answer 1: (score: 0)

You can try fillna.

import numpy as np
import pandas as pd

df = pd.DataFrame([["Alabama", "Auburn"],
                   [np.nan, "Florence"],
                   [np.nan, "Fairbanks"],
                   ["Arizona", "Flagstaff"]], columns=["State", "RegionName"])
df
Out[94]: 
     State RegionName
0  Alabama     Auburn
1      NaN   Florence
2      NaN  Fairbanks
3  Arizona  Flagstaff


df.fillna(method='ffill')
Out[95]: 
     State RegionName
0  Alabama     Auburn
1  Alabama   Florence
2  Alabama  Fairbanks
3  Arizona  Flagstaff
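
Note that the method argument of fillna is deprecated in recent pandas releases (2.1 and later), so the call above may emit a warning or fail depending on the installed version. A minimal sketch of the same forward fill using DataFrame.ffill, which returns a new frame and leaves df unchanged; the name filled is illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([["Alabama", "Auburn"],
                   [np.nan, "Florence"],
                   [np.nan, "Fairbanks"],
                   ["Arizona", "Flagstaff"]], columns=["State", "RegionName"])

# Forward-fill every column; df itself is not modified.
filled = df.ffill()
print(filled)
```

To fill only the State column, as in the first answer, assign the result back: df['State'] = df['State'].ffill().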