Add blank rows if a condition is met

Time: 2019-11-07 18:45:46

Tags: python pandas dataframe

I have a DataFrame that looks like the one below. Is there a way to add blank rows above each row whose Description is empty?

   Type     Date            Src       Description                    ST or OT
2    A      2019-06-10          AP  
3    A      2019-06-10          AP    Boston-Alliant Insurance Services     ST
5    B      2019-05-16          AP  
6    B      2019-05-16          AP    City of Cambridge                     ST
7    B      2019-05-16          AP    City of Cambridge                     OT
8    B      2019-08-20          AP    Jeffrey Soderquist                    OT
905  C      2019-05-01          PR  
906  C      2019-05-01          AP    Citibusiness Card                     ST
907  C      2019-07-26          AP    Pro Tool and Supply Inc               OT
908  D      2019-09-25          PR  
909  D      2019-09-25          PR    21/O'Leary                            ST
910  D      2019-09-26          PR    21/O'Leary                            ST

This is what I want to end up with:

   Type     Date            Src       Description                    ST or OT
2    A      2019-06-10          AP  
3    A      2019-06-10          AP    Boston-Alliant Insurance Services     ST
5
6
7    B      2019-05-16          AP  
8    B      2019-05-16          AP    City of Cambridge                     ST
9    B      2019-05-16          AP    City of Cambridge                     OT
10   B      2019-08-20          AP    Jeffrey Soderquist                    OT
905
906
907  C      2019-05-01          PR  
908  C      2019-05-01          AP    Citibusiness Card                     ST
909  C      2019-07-26          AP    Pro Tool and Supply Inc               OT
910
911
912  D      2019-09-25          PR  
913  D      2019-09-25          PR    21/O'Leary                            ST
914  D      2019-09-26          PR    21/O'Leary                            ST

3 answers:

Answer 0: (score: 1)

Your expected result doesn't match your description: the first row has a blank Description, yet there is no blank row above it.

Here is one way to do it:

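import pandas as pd

# Copy every row with an empty Description and blank out its other columns
blanks = df[df['Description'].str.strip() == ''] \
            .assign(**{
                'Type': '',
                'Date': pd.NaT,
                'Src': '',
                'ST or OT': ''
            })
# Shift the copies' labels down by one so they sort just above the originals
blanks.index -= 1

df = pd.concat([df, blanks]).sort_index()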

Result:

    Type       Date Src                        Description  ST or OT
1               NaT                                                 
2      A 2019-06-10  AP                                          NaN
3      A 2019-06-10  AP  Boston-Alliant Insurance Services        ST
4               NaT                                                 
5      B 2019-05-16  AP                                          NaN
6      B 2019-05-16  AP                  City of Cambridge        ST
7      B 2019-05-16  AP                  City of Cambridge        OT
8      B 2019-08-20  AP                 Jeffrey Soderquist        OT
904             NaT                                                 
905    C 2019-05-01  PR                                          NaN
906    C 2019-05-01  AP                  Citibusiness Card        ST
907    C 2019-07-26  AP            Pro Tool and Supply Inc        OT
907             NaT                                                 
908    D 2019-09-25  PR                                          NaN
909    D 2019-09-25  PR                         21/O'Leary        ST
910    D 2019-09-26  PR                         21/O'Leary        ST
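
Note that label 907 appears twice in the result above, because 908 - 1 collides with an existing row. A minimal sketch of one way around that, assuming it is acceptable to renumber the index afterwards: give each blank row a half-step label, sort, then reset the index. The small sample frame is made up for illustration and only loosely follows the question's data.

```python
import pandas as pd

# Hypothetical sample, loosely modelled on the question's data.
df = pd.DataFrame(
    {
        "Type": ["A", "A", "B", "B", "C", "C"],
        "Description": [
            "", "Boston-Alliant Insurance Services",
            "", "City of Cambridge",
            "", "Citibusiness Card",
        ],
    },
    index=[2, 3, 5, 6, 905, 906],
)

# Boolean mask of rows whose Description is empty after stripping whitespace.
is_blank = (df["Description"].str.strip() == "").to_numpy()

# One all-empty row per match, labelled half a step before the matching row,
# so the new label cannot collide with an existing integer label.
blanks = pd.DataFrame("", index=df.index[is_blank] - 0.5, columns=df.columns)

# Sorting interleaves the blank rows; the index is then renumbered from 0.
result = pd.concat([df, blanks]).sort_index().reset_index(drop=True)
print(result)
```

Because the index is reset at the end, the original labels are lost; keep them in a column first if they matter.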

Answer 1: (score: 0)

You can use:


import numpy as np  # assumes pandas is already imported as pd
rows = []
for pos, (_, row) in enumerate(df.iterrows()):
    # Add two empty rows wherever Type differs from the previous row
    if pos > 0 and row['Type'] != df['Type'].iloc[pos - 1]:
        rows += [pd.Series(np.nan, index=df.columns)] * 2
    rows.append(row)
df = pd.DataFrame(rows).reset_index(drop=True)
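
For larger frames, a loop-free variant of the same idea (two empty rows wherever Type changes) might look like the sketch below. It assumes the rows are already ordered by Type; the sample frame and the names `group_starts`, `positions`, and `order` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with three Type groups.
df = pd.DataFrame(
    {
        "Type": ["A", "A", "B", "B", "B", "C"],
        "Src": ["AP", "AP", "AP", "AP", "AP", "PR"],
    }
)

# True wherever Type differs from the previous row (row 0 is always True).
group_starts = df["Type"].ne(df["Type"].shift()).to_numpy()

# Positions of every group start except the first, each repeated twice
# because two blank rows are wanted per boundary.
positions = np.repeat(np.flatnonzero(group_starts)[1:], 2)

# Row order for the output; -1 is a placeholder that matches no label,
# so reindex fills those rows with NaN.
order = np.insert(np.arange(len(df)), positions, -1)
result = df.reset_index(drop=True).reindex(order).reset_index(drop=True)
print(result)
```

The blank rows hold NaN here; calling `result.fillna("")` afterwards would turn them into empty strings if that is preferred for display.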

Answer 2: (score: 0)

You can use np.insert, which lets you add values at a given index position and is quite flexible when working with lists.

import numpy as np
import pandas as pd
indices = df.loc[df['Description'] == ' '].index.tolist()  # index labels of the blank rows
rows_ = dict.fromkeys(df.columns.tolist(), '')  # one empty value per column

Then we simply insert the values of the rows_ variable at the chosen index positions and use its keys as the column names.

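# np.insert places each new row *before* the given position. This assumes a
# default 0..n-1 index; x - 1 targets the row above each match, and -1 counts from the end.
df_new = pd.DataFrame(np.insert(df.values, [x - 1 for x in indices],
                   values=list(rows_.values()),
                   axis=0), columns=rows_.keys())
print(df_new)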


 Type        Date Src                        Description ST or OT
0     A  10/06/2019  AP                                          ST
1                                                                  
2     A  10/06/2019  AP  Boston-Alliant Insurance Services         
3     B  16/05/2019  AP                                          ST
4     B  16/05/2019  AP                  City of Cambridge       OT
5     B  16/05/2019  AP                  City of Cambridge         
6                                                                  
7     B  20/08/2019  AP                 Jeffrey Soderquist       OT
8     C  01/05/2019  PR                                            
9     C  01/05/2019  AP                  Citibusiness Card       ST
10                                                                 
11    C  26/07/2019  AP            Pro Tool and Supply Inc       OT
12    D  25/09/2019  PR                                            
13    D  25/09/2019  PR                         21/O'Leary       ST
14                                                                 
15    D  26/09/2019  PR                         21/O'Leary       ST
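
If the goal is a blank row directly above each match, a sketch along the following lines may help. It assumes positions are computed with np.flatnonzero instead of being taken from index labels, so it does not rely on a default 0..n-1 index, and it inserts at the match position itself, not at x - 1. The sample frame is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with a non-default index.
df = pd.DataFrame(
    {
        "Type": ["A", "A", "B", "B"],
        "Src": ["AP", "AP", "AP", "AP"],
        "Description": ["", "Boston-Alliant Insurance Services", "", "City of Cambridge"],
    },
    index=[2, 3, 5, 6],
)

# Integer positions (not labels) of rows with an empty Description.
positions = np.flatnonzero((df["Description"].str.strip() == "").to_numpy())

# One empty string per column; np.insert broadcasts it to every position.
blank_row = [""] * df.shape[1]

# Insert a blank row before each matching position along the row axis.
values = np.insert(df.to_numpy(dtype=object), positions, blank_row, axis=0)
df_new = pd.DataFrame(values, columns=df.columns)
print(df_new)
```

Going through a NumPy object array drops the original dtypes and index labels, so this is mainly suitable when the frame is about to be exported or displayed.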