I have a data frame similar to this:
status time text
1 12:25 some text
NaN NaN status 1 txt
NaN NaN s1
2 15:23 some text
NaN NaN status 2 txt
NaN NaN s2
I want to merge the rows by status without losing the text cells, like this:
status time text
1 12:25 some text status 1 txt s1
2 15:23 some text status 2 txt s2
I have already tried grouping by status, but I lose the text cells:
df = df.groupby("status")[["time", "text"]].first().reset_index()
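Answer 0 (score: 2)
Try building a group key that increments on every row where status or time is present, then aggregate by that key: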
# Start a new group at every row where status or time is present
df["grp"] = (df["status"].notna() | df["time"].notna()).cumsum()
# Keep the first status and time of each group; join the text pieces with spaces
df = df.groupby("grp").agg({"status": "first", "time": "first", "text": " ".join})
# Optionally, drop the "grp" index afterwards:
# df = df.reset_index(drop=True)
Output:
status time text
grp
1 1.0 12:25 some text status 1 txt s1
2 2.0 15:23 some text status 2 txt s2
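For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The frame is rebuilt by hand from the sample in the question, assuming the empty cells are real NaN values and not the string "NaN". The names `grp` and `result` are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumption: missing cells are real NaN)
df = pd.DataFrame({
    "status": [1, np.nan, np.nan, 2, np.nan, np.nan],
    "time": ["12:25", np.nan, np.nan, "15:23", np.nan, np.nan],
    "text": ["some text", "status 1 txt", "s1",
             "some text", "status 2 txt", "s2"],
})

# The counter goes up by one on each row that has a status or a time,
# so continuation rows share the number of the row above them.
grp = (df["status"].notna() | df["time"].notna()).cumsum().rename("grp")

result = (
    df.groupby(grp)
      .agg({"status": "first", "time": "first", "text": " ".join})
      .reset_index(drop=True)
)
print(result)
```

Note that status comes back as a float (1.0, 2.0) because the column contained NaN; if that matters, casting with `result["status"].astype(int)` afterwards should work, provided every group has a status.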