Modifying a DataFrame in Python

Time: 2018-07-31 23:16:31

Tags: python pandas dataframe

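I want to transform the raw data in df1 into the form shown in df2: each row with a blank value in column A should be merged into the nearest preceding row that has a value in A, with the strings in column B concatenated.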

import pandas as pd

df1=pd.DataFrame([["20180105","abcdefg"],["","sdasdas"],["20180211","asdasfsd"],["","asdfg"],["","sdada"]],columns=["A","B"])

df2=pd.DataFrame([["20180105","abcdefgsdasdas"],["20180211","asdasfsdasdfgsdada"]],columns=["A","B"])


2 answers:

Answer 0: (score: 2)

You can use groupby, with sum to concatenate the strings:

import numpy as np

df1.replace({'A':{'':np.nan}}).ffill().groupby('A', as_index=False).sum()

          A                   B
0  20180105      abcdefgsdasdas
1  20180211  asdasfsdasdfgsdada

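Note that I first replaced the empty strings in column A with NaN and then forward-filled them with ffill(), so that every row carries the A value of the record it belongs to.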
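
To make the chain in this answer easier to follow, here is a minimal sketch that runs the same idea one step at a time. The intermediate name `filled` is only illustrative, and the last step uses `''.join` instead of `sum`; for this data both simply concatenate the strings in each group.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    [["20180105", "abcdefg"], ["", "sdasdas"], ["20180211", "asdasfsd"],
     ["", "asdfg"], ["", "sdada"]],
    columns=["A", "B"],
)

# Work on a copy so df1 itself is left unchanged.
filled = df1.copy()

# Step 1: turn the blank strings in column A into NaN (missing values).
filled["A"] = filled["A"].replace("", np.nan)

# Step 2: forward-fill, so each row inherits the last non-missing A above it.
filled["A"] = filled["A"].ffill()

# Step 3: group on the filled key and concatenate the B strings per group.
result = filled.groupby("A", as_index=False)["B"].agg("".join)
print(result)
```

Printing `filled` after step 2 shows why the grouping works: rows 0 and 1 share the key 20180105, and rows 2 to 4 share 20180211.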

Answer 1: (score: 2)

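You can also use agg with ''.join. The grouping key g increases by one at every row where A is non-empty, so each record and its continuation rows share the same key: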

g = (df1.A != '').cumsum()
df1.groupby(g, as_index=False).agg(''.join)

    A           B 
0   20180105    abcdefgsdasdas
1   20180211    asdasfsdasdfgsdada
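
As a rough illustration of how the grouping key in this answer is built, the sketch below prints the intermediate values. The names `starts` and `block` are made up for this example, and it drops the group key with `reset_index(drop=True)` rather than `as_index=False`, which is just one way to get a clean 0-based index.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["20180105", "abcdefg"], ["", "sdasdas"], ["20180211", "asdasfsd"],
     ["", "asdfg"], ["", "sdada"]],
    columns=["A", "B"],
)

# True on rows that start a new record (A is not blank).
starts = df1["A"] != ""

# The cumulative sum of the booleans numbers the records: 1, 1, 2, 2, 2.
# Renaming avoids any clash with the existing column "A".
block = starts.cumsum().rename("block")
print(block.tolist())

# Join the strings of every column within each block. Joining A works
# because the blank cells contribute nothing to the result.
result = df1.groupby(block).agg("".join).reset_index(drop=True)
print(result)
```

Unlike the forward-fill approach, this keeps two records separate even if they happen to have the same value in A, because the key depends on row position rather than on the A value itself.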