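I'd like to check a column and, if a row's date is the same as the next row's, merge their remarks. There may be more than two rows with the same date.
My current code is stuck at this stage: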
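import pandas as pd

df = {'date': ['02-Jan','02-Jan','03-Jan','03-Jan','03-Jan','04-Jan','05-Jan'],
      'remarks':['a','b','c','d','e','f','g']}
df = pd.DataFrame(df)
for eachRow in range(len(df)):
    print("row" , eachRow)
    try:
        # compare this row's date with the next row's date
        if(df['date'][eachRow] == df['date'][eachRow + 1]):
            # append the next row's remark to this row, then drop the next row
            df['remarks'][eachRow] = df['remarks'][eachRow] + df['remarks'][eachRow + 1]
            print('drop', eachRow+1)
            df = df.drop(eachRow + 1)
            print(df)
    except:
        # reached when eachRow or eachRow + 1 is no longer a valid index label
        print(df)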
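My current output is shown below. I noticed that when more than two consecutive rows share the same date, dropping row 3 means rows 2 and 4 are never compared: the eachRow pointer has already moved on to row 3, which no longer exists, so there is nothing to compare. If I choose not to drop the next row, the duplicate rows end up with the wrong remarks. What should I do?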
row 0
drop 1
date remarks
0 02-Jan ab
2 03-Jan c
3 03-Jan d
4 03-Jan e
5 04-Jan f
6 05-Jan g
row 1
date remarks
0 02-Jan ab
2 03-Jan c
3 03-Jan d
4 03-Jan e
5 04-Jan f
6 05-Jan g
row 2
drop 3
date remarks
0 02-Jan ab
2 03-Jan cd
4 03-Jan e
5 04-Jan f
6 05-Jan g
row 3
date remarks
0 02-Jan ab
2 03-Jan cd
4 03-Jan e
5 04-Jan f
6 05-Jan g
row 4
row 5
row 6
date remarks
0 02-Jan ab
2 03-Jan cd
4 03-Jan e
5 04-Jan f
6 05-Jan g
Answer 0 (score: 1)
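A simple change fixes it: instead of dropping the next row (eachRow+1), drop the current row (eachRow):
df = df.drop(eachRow)
Because the current row is the one being dropped, the concatenated remark has to be stored in the next row. So change the assignment to:
df['remarks'][eachRow+1] = df['remarks'][eachRow] + df['remarks'][eachRow + 1]
The full code: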
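import pandas as pd

df = {'date': ['02-Jan','02-Jan','03-Jan','03-Jan','03-Jan','04-Jan','05-Jan'],
      'remarks':['a','b','c','d','e','f','g']}
df = pd.DataFrame(df)
for eachRow in range(len(df)):
    print("row" , eachRow)
    try:
        if(df['date'][eachRow] == df['date'][eachRow + 1]):
            # carry the accumulated remarks forward into the next row
            # (df.loc[eachRow + 1, 'remarks'] = ... is the more robust way to assign)
            df['remarks'][eachRow+1] = df['remarks'][eachRow] + df['remarks'][eachRow + 1]
            print('drop', eachRow)
            # drop the current row; the next row is still there for the next iteration
            df = df.drop(eachRow)
            print(df)
    except KeyError:
        # the last row has no eachRow + 1 to compare against
        print(df)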
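As an alternative to looping and dropping rows, the same merge can probably be done without a loop. The following is only a sketch, assuming the goal is to join the remarks of each run of consecutive rows that share a date. The names run_id and merged are made up for this illustration and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'date': ['02-Jan', '02-Jan', '03-Jan', '03-Jan', '03-Jan', '04-Jan', '05-Jan'],
    'remarks': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
})

# Label each run of consecutive identical dates: the counter goes up
# whenever a row's date differs from the date on the previous row.
# (run_id is a hypothetical helper name, not part of the original code.)
run_id = (df['date'] != df['date'].shift()).cumsum().rename('run_id')

# Within each run, keep the first date and join the remarks in row order.
merged = (
    df.groupby(run_id)
      .agg(date=('date', 'first'),
           remarks=('remarks', lambda s: ''.join(s)))
      .reset_index(drop=True)
)
print(merged)
```

With the sample data this should give one row per date, with remarks ab, cde, f and g. If the same date can never reappear later in the column, grouping on 'date' directly (with sort=False to keep the original order) would be enough.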