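I have a DataFrame that looks like the one below. Is there a way to add blank rows above each row whose Description is empty? The data looks like this: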
Type Date Src Description ST or OT
2 A 2019-06-10 AP
3 A 2019-06-10 AP Boston-Alliant Insurance Services ST
5 B 2019-05-16 AP
6 B 2019-05-16 AP City of Cambridge ST
7 B 2019-05-16 AP City of Cambridge OT
8 B 2019-08-20 AP Jeffrey Soderquist OT
905 C 2019-05-01 PR
906 C 2019-05-01 AP Citibusiness Card ST
907 C 2019-07-26 AP Pro Tool and Supply Inc OT
908 D 2019-09-25 PR
909 D 2019-09-25 PR 21/O'Leary ST
910 D 2019-09-26 PR 21/O'Leary ST
This is what I want in the end:
Type Date Src Description ST or OT
2 A 2019-06-10 AP
3 A 2019-06-10 AP Boston-Alliant Insurance Services ST
5
6
7 B 2019-05-16 AP
8 B 2019-05-16 AP City of Cambridge ST
9 B 2019-05-16 AP City of Cambridge OT
10 B 2019-08-20 AP Jeffrey Soderquist OT
905
906
907 C 2019-05-01 PR
908 C 2019-05-01 AP Citibusiness Card ST
909 C 2019-07-26 AP Pro Tool and Supply Inc OT
910
911
912 D 2019-09-25 PR
913 D 2019-09-25 PR 21/O'Leary ST
914 D 2019-09-26 PR 21/O'Leary ST
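Answer 0 (score: 1)
Your expected result does not match your description: the first row has a blank Description, but there is no blank row above it.
Here is one way to do it: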
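# Copy every row whose Description is blank and empty out its other columns.
blanks = df[df['Description'].str.strip() == ''] \
    .assign(**{
        'Type': '',
        'Date': pd.NaT,
        'Src': '',
        'ST or OT': ''
    })
# Shift the copies one label up so they sort just above the rows they came from.
blanks.index -= 1
df = pd.concat([df, blanks]).sort_index()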
Result:
Type Date Src Description ST or OT
1 NaT
2 A 2019-06-10 AP NaN
3 A 2019-06-10 AP Boston-Alliant Insurance Services ST
4 NaT
5 B 2019-05-16 AP NaN
6 B 2019-05-16 AP City of Cambridge ST
7 B 2019-05-16 AP City of Cambridge OT
8 B 2019-08-20 AP Jeffrey Soderquist OT
904 NaT
905 C 2019-05-01 PR NaN
906 C 2019-05-01 AP Citibusiness Card ST
907 C 2019-07-26 AP Pro Tool and Supply Inc OT
907 NaT
908 D 2019-09-25 PR NaN
909 D 2019-09-25 PR 21/O'Leary ST
910 D 2019-09-26 PR 21/O'Leary ST
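The result above contains the label 907 twice, because shifting 908 up by one collides with an existing row. A minimal sketch of one way to avoid depending on the original labels follows. It assumes a fresh 0-based index is acceptable and that blank cells may be filled with empty strings (which would turn a datetime column into object dtype). The function name add_blank_rows_above and the sample data are made up for illustration.

```python
import numpy as np
import pandas as pd


def add_blank_rows_above(df, column="Description"):
    """Return a copy of df with one empty row above each row whose `column` is blank."""
    out = df.reset_index(drop=True)
    # Treat NaN and whitespace-only strings as blank.
    is_blank = out[column].fillna("").astype(str).str.strip().eq("")
    pos = np.flatnonzero(is_blank.to_numpy())
    # Key each empty row half a step before the row it belongs to,
    # so sorting by index puts it directly above that row.
    blanks = pd.DataFrame("", index=pos - 0.5, columns=out.columns)
    return pd.concat([out, blanks]).sort_index().reset_index(drop=True)


# Hypothetical sample with two blank descriptions.
sample = pd.DataFrame({
    "Type": ["A", "A", "B", "B"],
    "Description": ["", "Boston-Alliant Insurance Services", "", "City of Cambridge"],
})
result = add_blank_rows_above(sample)
print(result)
```

Because the helper rows use fractional keys, they can never collide with an existing row, and the final reset_index gives consecutive labels.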
Answer 1 (score: 0)
You can use:
rows, prev = [], None
for index, row in df.iterrows():
    # When Type changes, put two all-NaN rows in front of the new group.
    if prev is not None and row['Type'] != prev:
        rows += [pd.Series(np.nan, index=df.columns)] * 2
    rows.append(row)
    prev = row['Type']
df = pd.DataFrame(rows).reset_index(drop=True)
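The desired output in the question shows two empty rows before each new Type group, rather than one above each blank Description. A loop-free sketch of that reading is shown here. It assumes rows of the same Type are already adjacent and that a fresh 0-based index is acceptable. The function name add_blank_rows_between_groups and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd


def add_blank_rows_between_groups(df, key="Type", n_blank=2):
    """Return a copy of df with n_blank all-NaN rows before each new `key` group."""
    out = df.reset_index(drop=True)
    # Row positions where the key differs from the previous row;
    # the first row always differs from "nothing", so it is dropped.
    pos = np.flatnonzero(out[key].ne(out[key].shift()).to_numpy())[1:]
    # np.insert places each new row before the given position of the original array.
    values = np.insert(out.to_numpy(dtype=object), np.repeat(pos, n_blank), np.nan, axis=0)
    return pd.DataFrame(values, columns=out.columns)


# Hypothetical sample with three Type groups.
sample = pd.DataFrame({"Type": list("AABBBC"), "Src": ["AP"] * 6})
result = add_blank_rows_between_groups(sample)
print(result)
```

The returned frame has object dtype in every column, so dates would need to be converted again with pd.to_datetime if they are used afterwards.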
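Answer 2 (score: 0)
You can use np.insert, which lets you add values by index position and is very flexible when working with lists.
indices = df.loc[df['Description'] == ' '].index.tolist()  # index labels of the rows with a blank Description
rows_ = dict.fromkeys(df.columns.tolist(), '')  # dict mapping every column name to an empty string
Then we pass the values of rows_ as the values to insert and its keys as the column names, at the index positions you selected.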
df_new = pd.DataFrame(np.insert(df.values, [x - 1 for x in indices],
                                values=list(rows_.values()),
                                axis=0), columns=rows_.keys())
print(df_new)
Type Date Src Description ST or OT
0 A 10/06/2019 AP ST
1
2 A 10/06/2019 AP Boston-Alliant Insurance Services
3 B 16/05/2019 AP ST
4 B 16/05/2019 AP City of Cambridge OT
5 B 16/05/2019 AP City of Cambridge
6
7 B 20/08/2019 AP Jeffrey Soderquist OT
8 C 01/05/2019 PR
9 C 01/05/2019 AP Citibusiness Card ST
10
11 C 26/07/2019 AP Pro Tool and Supply Inc OT
12 D 25/09/2019 PR
13 D 25/09/2019 PR 21/O'Leary ST
14
15 D 26/09/2019 PR 21/O'Leary ST
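Note that np.insert works with 0-based row positions, while df.index holds labels, and with the labels in the question (2, 3, 5, ...) the two differ. The sketch below looks up positions with np.flatnonzero instead. It assumes a blank Description is an empty or whitespace-only string; the sample data is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with non-consecutive labels, as in the question.
sample = pd.DataFrame(
    {
        "Type": ["A", "A", "B", "B"],
        "Description": [" ", "Boston-Alliant Insurance Services", " ", "City of Cambridge"],
    },
    index=[2, 3, 5, 6],
)

# 0-based positions of the blank descriptions, independent of the index labels.
positions = np.flatnonzero(sample["Description"].str.strip().eq("").to_numpy())
empty_row = [""] * sample.shape[1]

# Each empty row is inserted before the row at the given position.
values = np.insert(sample.to_numpy(dtype=object), positions, empty_row, axis=0)
df_new = pd.DataFrame(values, columns=sample.columns)
print(df_new)
```

Inserting at the position itself, rather than at position minus one, is what puts the empty row directly above the blank-description row.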