Hi everyone, I have 3 DataFrames:
df1
LC_REF Category vals
0 DT 17 1C WM dog
1 DT 17 1C WH foo, bat
2 DT 17 1C WP red, steam
df2
LC_REF Category vals
0 DT 17 1C WM cat
1 DT 17 1C WH sea, bat
df3
LC_REF Category vals
0 DT 17 1C WM turn
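I'd like to know whether there is a way to fill in every DataFrame that does not contain all of WM, WH and WP in the 'Category' column by inserting the missing categories: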
df1
LC_REF Category vals
0 DT 17 1C WM dog
1 DT 17 1C WH foo, bat
2 DT 17 1C WP red, steam
df2
LC_REF Category vals
0 DT 17 1C WM cat
1 DT 17 1C WH sea, bat
2 DT 17 1C WP NaN
df3
LC_REF Category vals
0 DT 17 1C WM turn
1 DT 17 1C WH NaN
2 DT 17 1C WP NaN
My attempt:
for df in (df1, df2, df3):
    if df.Category.isin(['WM', 'WH', 'WP']).sum() == 3:
        continue
    else:
        ???
I know I need to involve a boolean mask, but I'm not sure how best to go about it.
Answer 0 (score: 1)
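# use Category as the index so reindex can add the missing categories
df2.index = df2.Category
df2 = df2.reindex(['WM', 'WH', 'WP'])
# new rows are all NaN; carry LC_REF down and restore Category from the index
df2['LC_REF'] = df2['LC_REF'].ffill()
df2.Category = df2.index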
LC_REF Category vals
Category
WM 0 DT 17 1C WM cat
WH 1 DT 17 1C WH sea, bat
WP 1 DT 17 1C WP NaN
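The same reindex idea can be wrapped in a small helper and applied to each frame in turn. This is only a sketch, not the answerer's code: the name `fill_categories` is made up for illustration, the sample frames are rebuilt from the question on the assumption that LC_REF is the string 'DT 17 1C', and filling LC_REF in both directions assumes it is constant within a frame.

```python
import pandas as pd

CATEGORIES = ['WM', 'WH', 'WP']

def fill_categories(df, categories=CATEGORIES):
    """Return a copy of df with one row per category (hypothetical helper)."""
    out = (
        df.set_index('Category')
          .reindex(categories)        # rows for missing categories are all NaN
          .rename_axis('Category')
          .reset_index()              # Category becomes a regular column again
    )
    # Assumption: LC_REF is the same for every row of a frame.
    out['LC_REF'] = out['LC_REF'].ffill().bfill()
    return out[['LC_REF', 'Category', 'vals']]

# Sample data rebuilt from the question (assumed column contents).
df2 = pd.DataFrame({'LC_REF': ['DT 17 1C'] * 2,
                    'Category': ['WM', 'WH'],
                    'vals': ['cat', 'sea, bat']})
df3 = pd.DataFrame({'LC_REF': ['DT 17 1C'],
                    'Category': ['WM'],
                    'vals': ['turn']})

filled2 = fill_categories(df2)
filled3 = fill_categories(df3)
print(filled3)
```

Because the helper resets the index, the result keeps the original three-column layout instead of a Category index.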
Here is another way, using pd.concat, stack and unstack:
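import pandas as pd
DF = pd.concat([df1, df2], axis=0, keys=['df1', 'df2']).reset_index()
# the unstack/stack round trip creates a row for every (frame, Category) pair
DF = DF.groupby(["level_0", "Category"]).agg({'LC_REF': 'sum', 'vals': 'sum'}).unstack('Category').stack('Category', dropna=False)
# assign back rather than calling ffill(inplace=True) on a column selection
DF['LC_REF'] = DF['LC_REF'].ffill()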
DF
Out[696]:
LC_REF vals
level_0 Category
df1 WH 1 DT 17 1C foo, bat
WM 0 DT 17 1C dog
WP 2 DT 17 1C red, steam
df2 WH 1 DT 17 1C sea, bat
WM 0 DT 17 1C cat
WP 0 DT 17 1C None
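For comparison, here is a variation that keeps the concat step but swaps the groupby/unstack/stack round trip for a reindex against the full (frame, category) product. It is a sketch under assumptions rather than the method above: the sample frames are rebuilt from the question with LC_REF taken to be 'DT 17 1C', and the names `frames`, `combined` and `full_index` are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed column contents).
df1 = pd.DataFrame({'LC_REF': ['DT 17 1C'] * 3,
                    'Category': ['WM', 'WH', 'WP'],
                    'vals': ['dog', 'foo, bat', 'red, steam']})
df2 = pd.DataFrame({'LC_REF': ['DT 17 1C'] * 2,
                    'Category': ['WM', 'WH'],
                    'vals': ['cat', 'sea, bat']})
df3 = pd.DataFrame({'LC_REF': ['DT 17 1C'],
                    'Category': ['WM'],
                    'vals': ['turn']})

frames = {'df1': df1, 'df2': df2, 'df3': df3}

# Stack the frames, keep only the frame label, then index by (frame, Category).
combined = pd.concat(frames).droplevel(1).rename_axis('frame')
combined = combined.set_index('Category', append=True)

# Every (frame, category) pair that should exist in the result.
full_index = pd.MultiIndex.from_product(
    [list(frames), ['WM', 'WH', 'WP']], names=['frame', 'Category'])
combined = combined.reindex(full_index)

# Fill LC_REF within each frame only (assumes it is constant per frame).
combined['LC_REF'] = (
    combined.groupby(level='frame')['LC_REF']
            .transform(lambda s: s.ffill().bfill())
)
print(combined)
```

Slicing with `combined.loc['df3']` then gives the completed version of a single frame, indexed by Category.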
PS: use DF.loc['df1'] to slice out the rows for df1. Note that NaN and None are not the same thing; you can read more here.
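To make the NaN/None distinction from the PS concrete, here is a small sketch. The Series `s` is made up for illustration, and `dtype=object` is used on the assumption that the vals column holds plain Python objects.

```python
import numpy as np
import pandas as pd

# An object column can hold both kinds of "missing" placeholder.
s = pd.Series(['cat', None, np.nan], dtype=object)

# pandas treats both of them as missing values.
print(s.isna().tolist())

# They are still different Python objects: None is a singleton,
# while NaN is a float that does not compare equal to itself.
print(s.iloc[1] is None)
print(isinstance(s.iloc[2], float))
print(s.iloc[2] != s.iloc[2])
```

In practice, `isna()` (or `pd.isna`) is the safer check, since it does not depend on which placeholder ended up in the column.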