Inserting dummy information into a DataFrame

Time: 2017-08-04 13:33:16

Tags: python python-3.x pandas dataframe

Hi all, I have 3 DataFrames:

df1
   LC_REF   Category    vals
0 DT 17 1C     WM       dog
1 DT 17 1C     WH       foo, bat
2 DT 17 1C     WP       red, steam

df2
   LC_REF   Category    vals
0 DT 17 1C     WM       cat
1 DT 17 1C     WH       sea, bat

df3
   LC_REF   Category    vals
0 DT 17 1C     WM       turn

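Is there a way to fill in every DataFrame that does not contain all of WM, WH and WP in the 'Category' column by inserting the missing categories? The desired result: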

df1
   LC_REF   Category    vals
0 DT 17 1C     WM       dog
1 DT 17 1C     WH       foo, bat
2 DT 17 1C     WP       red, steam

df2
   LC_REF   Category    vals
0 DT 17 1C     WM       cat
1 DT 17 1C     WH       sea, bat
2 DT 17 1C     WP       NaN

df3
   LC_REF   Category    vals
0 DT 17 1C     WM       turn
1 DT 17 1C     WH       NaN
2 DT 17 1C     WP       NaN

My attempt:

for df in (df1, df2, df3):
    # .sum() counts the True values of the mask; .count() would count every row
    if df.Category.isin(['WM', 'WH', 'WP']).sum() == 3:
        continue
    else:
        ???

I know I need to involve a boolean mask, but I'm not sure how best to go about it.
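For reference, the check the attempt is reaching for could be sketched roughly as below. The frame is rebuilt from the df2 shown above; `REQUIRED` and `missing_categories` are names made up for this illustration, not anything from pandas.

```python
import pandas as pd

REQUIRED = ['WM', 'WH', 'WP']  # hypothetical name for the expected categories

df2 = pd.DataFrame({
    'LC_REF': ['DT 17 1C', 'DT 17 1C'],
    'Category': ['WM', 'WH'],
    'vals': ['cat', 'sea, bat'],
})

def missing_categories(df, required=REQUIRED):
    """Return the required categories that do not appear in df.Category."""
    present = set(df['Category'])
    return [c for c in required if c not in present]

# Boolean mask over the rows: True where the row's Category is a required one
mask = df2['Category'].isin(REQUIRED)
print(int(mask.sum()))            # 2
print(missing_categories(df2))    # ['WP']
```

The mask only says which existing rows are valid; the set difference is what identifies the rows that still have to be inserted.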

1 Answer:

Answer 0: (score: 1)

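# Index by Category so that reindex can add the missing categories
df2.index = df2.Category
df2 = df2.reindex(['WM', 'WH', 'WP'])
# Rows added by reindex are all NaN; forward-fill LC_REF from the row above
df2['LC_REF'] = df2['LC_REF'].ffill()
# Restore the Category column from the index
df2.Category = df2.index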

              LC_REF Category      vals
Category                               
WM        0 DT 17 1C       WM       cat
WH        1 DT 17 1C       WH  sea, bat
WP        1 DT 17 1C       WP       NaN
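To apply the same reindex idea to all three frames, it could be wrapped in a small helper. This is only a sketch: `fill_categories` and `CATEGORIES` are hypothetical names, and it assumes LC_REF is constant within a frame and that no category appears twice.

```python
import pandas as pd

CATEGORIES = ['WM', 'WH', 'WP']  # assumed full set of categories

def fill_categories(df, categories=CATEGORIES):
    """Return a copy of df with one row per category, NaN vals where missing."""
    out = df.set_index('Category').reindex(categories)
    # Rows created by reindex are NaN everywhere; LC_REF is assumed constant,
    # so fill it from the neighbouring rows in both directions
    out['LC_REF'] = out['LC_REF'].ffill().bfill()
    # Bring Category back as a column and restore the original column order
    out = out.rename_axis('Category').reset_index()
    return out[['LC_REF', 'Category', 'vals']]

df3 = pd.DataFrame({'LC_REF': ['DT 17 1C'], 'Category': ['WM'], 'vals': ['turn']})
print(fill_categories(df3))
```

Usage would then be something like `df1, df2, df3 = (fill_categories(d) for d in (df1, df2, df3))`. Note that `reindex` raises an error if the Category values are not unique.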

Here is another solution, using pd.concat, stack and unstack:
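# Stack the frames, keeping a key per source frame (it becomes column 'level_0')
DF = pd.concat([df1, df2], axis=0, keys=['df1', 'df2']).reset_index()
# unstack/stack with dropna=False keeps every (frame, Category) pair, including the gaps
DF = DF.groupby(["level_0", "Category"]).agg({'LC_REF': 'sum', 'vals': 'sum'}).unstack('Category').stack('Category', dropna=False)
# Assign the result back instead of calling ffill(inplace=True) on a column selection
DF['LC_REF'] = DF['LC_REF'].ffill()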


DF
Out[696]: 
                      LC_REF        vals
level_0 Category                        
df1     WH        1 DT 17 1C    foo, bat
        WM        0 DT 17 1C         dog
        WP        2 DT 17 1C  red, steam
df2     WH        1 DT 17 1C    sea, bat
        WM        0 DT 17 1C         cat
        WP        0 DT 17 1C        None

PS: use DF.loc['df1'] to slice out df1.
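A variation on the concat idea that skips the groupby/sum step is to reindex the combined frame against the full product of frame names and categories. The sketch below rebuilds df1 and df2 from the question; the level name `frame` and the variables `frames`, `categories` and `full_index` are choices made for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'LC_REF': ['DT 17 1C'] * 3,
                    'Category': ['WM', 'WH', 'WP'],
                    'vals': ['dog', 'foo, bat', 'red, steam']})
df2 = pd.DataFrame({'LC_REF': ['DT 17 1C'] * 2,
                    'Category': ['WM', 'WH'],
                    'vals': ['cat', 'sea, bat']})

frames = {'df1': df1, 'df2': df2}
categories = ['WM', 'WH', 'WP']

# One long frame indexed by (source frame, Category)
combined = pd.concat(
    {name: f.set_index('Category') for name, f in frames.items()},
    names=['frame', 'Category'],
)
# Every (frame, Category) pair that should exist
full_index = pd.MultiIndex.from_product(
    [list(frames), categories], names=['frame', 'Category']
)
combined = combined.reindex(full_index)
# LC_REF is assumed constant within a frame, so forward-fill it per frame
combined['LC_REF'] = combined.groupby(level='frame')['LC_REF'].ffill()

# Slice one frame back out by its key
print(combined.loc['df2'])
```

Because the forward fill is done per frame, a missing first category would leave LC_REF empty for that row; a backward fill could be chained on if that case matters.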

NaN and None are different; you can read more about that here
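A quick way to see the difference, as a minimal sketch with a made-up Series `s`: pandas counts both values as missing, but they are distinct Python objects with different comparison behaviour.

```python
import numpy as np
import pandas as pd

s = pd.Series(['cat', None, np.nan], dtype=object)

# pandas treats both None and NaN as missing values
print(s.isna().tolist())                            # [False, True, True]

# but they are different objects of different types
print(None is np.nan)                               # False
print(type(None).__name__, type(np.nan).__name__)   # NoneType float

# NaN is a float that never compares equal to itself; None does
print(np.nan == np.nan, None == None)               # False True
```

This is why the output above shows None in the vals column (an object column that went through stack/unstack) while the reindex approach shows NaN.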