I want to transform the raw data in df1 into the form shown in df2:
import pandas as pd
df1=pd.DataFrame([["20180105","abcdefg"],["","sdasdas"],["20180211","asdasfsd"],["","asdfg"],["","sdada"]],columns=["A","B"])
df2=pd.DataFrame([["20180105","abcdefgsdasdas"],["20180211","asdasfsdasdfgsdada"]],columns=["A","B"])
Answer 0 (score: 2)
You can use groupby, with sum to concatenate the strings:
import numpy as np
df1.replace({'A':{'':np.nan}}).ffill().groupby('A', as_index=False).sum()
          A                   B
0  20180105      abcdefgsdasdas
1  20180211  asdasfsdasdfgsdada
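Note that I first replaced the empty strings in column A with NaN, then forward-filled them with ffill(), so every row carries the date of the block it belongs to.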
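The one-liner above can be broken into its three steps to make the intermediate results visible. This is only a sketch of the same idea: the names step1, step2 and result are illustrative, and it selects column B explicitly before summing, which the original one-liner does not do.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    [["20180105", "abcdefg"], ["", "sdasdas"], ["20180211", "asdasfsd"],
     ["", "asdfg"], ["", "sdada"]],
    columns=["A", "B"],
)

# Step 1: turn the empty strings in column A into NaN so they count as missing.
step1 = df1.replace({"A": {"": np.nan}})

# Step 2: forward-fill, so each row inherits the last date seen above it.
step2 = step1.ffill()

# Step 3: group by the filled date; summing strings concatenates them.
result = step2.groupby("A", as_index=False)["B"].sum()
print(result)
```

After step 2, column A reads 20180105, 20180105, 20180211, 20180211, 20180211, which is what lets groupby collect the fragments of B that belong together.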
Answer 1 (score: 2)
You can also use agg + ''.join:
g = (df1.A != '').cumsum()
df1.groupby(g, as_index=False).agg(''.join)
          A                   B
0  20180105      abcdefgsdasdas
1  20180211  asdasfsdasdfgsdada
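To see why the cumsum key works, it may help to look at the key itself. The sketch below is a variation on the answer, not the answer's exact code: it renames the key to a hypothetical label "block" to avoid a name clash with column A, and drops the group index afterwards instead of passing as_index=False.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["20180105", "abcdefg"], ["", "sdasdas"], ["20180211", "asdasfsd"],
     ["", "asdfg"], ["", "sdada"]],
    columns=["A", "B"],
)

# True wherever a new date starts; the running total numbers each block.
g = (df1["A"] != "").cumsum().rename("block")
print(g.tolist())  # [1, 1, 2, 2, 2]

# Join the strings of every column within each block. In column A this joins
# the date with empty strings, which leaves the date unchanged.
result = df1.groupby(g).agg("".join).reset_index(drop=True)
print(result)
```

Unlike the ffill approach, this never modifies column A, and two separate blocks that happen to share the same date stay separate.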