In Pandas, I have a DataFrame of this form:
                    value
SampleGroup sample
Group1      ref      18.1
            smp1      NaN
            smp2     20.3
            smp3     30.0
            smp4     23.8
            smp5     23.2
What I want to do is add a new column in which the reference (ref) value has been subtracted from every sample (smp). Like this:
                    value  deltaValue
SampleGroup sample
Group1      ref      18.1           0
            smp1      NaN         NaN
            smp2     20.3         2.2
            smp3     30.0        11.9
            smp4     23.8         5.7
            smp5     23.2         5.1
Does anyone know how to do this? Thanks!
Answer 0: (score: 2)
OK, I broke this down into steps; the following worked for me:
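In [327]:
import io
import pandas as pd

t="""sample value
ref 18.1
smp1 NaN
smp2 20.3
smp3 30.0
smp4 23.8
smp5 23.2"""
# sep is a regex (one or more whitespace characters); the raw string keeps the backslash intact
df = pd.read_csv(io.StringIO(t), sep=r'\s+')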
df
Out[327]:
  sample  value
0    ref   18.1
1   smp1    NaN
2   smp2   20.3
3   smp3   30.0
4   smp4   23.8
5   smp5   23.2
In [328]:
df['Group'] = 'Group1'
df
Out[328]:
  sample  value   Group
0    ref   18.1  Group1
1   smp1    NaN  Group1
2   smp2   20.3  Group1
3   smp3   30.0  Group1
4   smp4   23.8  Group1
5   smp5   23.2  Group1
In [329]:
df1 = df.set_index(['Group', 'sample'])
df1
Out[329]:
               value
Group  sample
Group1 ref      18.1
       smp1      NaN
       smp2     20.3
       smp3     30.0
       smp4     23.8
       smp5     23.2
In [337]:
df1['deltaValue'] = df1['value'].sub(df1.loc[('Group1', 'ref'), 'value'])
df1
Out[337]:
               value  deltaValue
Group  sample
Group1 ref      18.1         0.0
       smp1      NaN         NaN
       smp2     20.3         2.2
       smp3     30.0        11.9
       smp4     23.8         5.7
       smp5     23.2         5.1
The following also works:
df1['deltaValue'] = df1['value'] - df1.loc[('Group1', 'ref'), 'value']
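
The subtraction above hard-codes 'Group1'. If the frame holds several groups, each with its own ref row, one possible generalization is to look up each row's group reference and subtract it. This is only a sketch under assumptions: the Group2 rows are invented for illustration (the question shows only Group1), and it assumes every group has exactly one ref row.

```python
import io

import pandas as pd

# Hypothetical data: Group1 values come from the question, Group2 is made up.
text = """SampleGroup sample value
Group1 ref 18.1
Group1 smp1 NaN
Group1 smp2 20.3
Group2 ref 10.0
Group2 smp1 12.5
Group2 smp2 9.0"""

df = pd.read_csv(io.StringIO(text), sep=r"\s+").set_index(["SampleGroup", "sample"])

# One reference value per group, indexed by SampleGroup.
ref = df["value"].xs("ref", level="sample")

# For every row, look up the reference of the group that row belongs to.
groups = df.index.get_level_values("SampleGroup")
row_ref = ref.reindex(groups).to_numpy()

# Positional subtraction: row i minus the ref of row i's group.
df["deltaValue"] = df["value"] - row_ref
print(df)
```

Rows whose value is NaN stay NaN in deltaValue, the same as in the single-group case.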