Subtracting a different reference value for each group of rows in pandas

Date: 2016-12-09 10:53:30

Tags: python pandas dataframe

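I searched and found the answer below, which comes close, but I can't work out how to apply it to my own case, because my reference values are not stored in the same dataframe.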

Subtracting group specific value from rows in pandas

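I have a dataframe like the one below. I want to apply a different reference value to the "Isotropic Shift" column depending on which Nucleus is present (here C and H, but in principle it could be any element in the periodic table):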

REF_H = 30
REF_C = 180
df
    Atom Number Nucleus  Isotropic Shift
0             1       C          49.3721
1             2       C          52.9650
2             3       C          36.3443
3             4       C          50.8163
4             5       C          50.0493
5             6       C          49.7985
6             7       H          24.0772
7             8       H          23.7986
8             9       H          24.2922
9            10       H          24.1632
10           11       H          24.1572
11           12       C         102.9401

So I would like this to return a Delta column whose values are the corresponding REF_H or REF_C value minus the Isotropic Shift:

modifieddf.tail(2)
    Atom Number Nucleus  Isotropic Shift    Delta
10           11       H          24.1572   5.8428
11           12       C         102.9401  77.0599

The best I have come up with so far is:

def generateHandC(df):
    h = df[df['Nucleus'] == 'H']
    h['delta'] = REF_H - h['Isotropic Shift']
    c = df[df['Nucleus'] == 'C']
    c['delta'] = REF_C - c['Isotropic Shift']
    return h, c

generateHandC(df)

Output:
(    Atom Number Nucleus  Isotropic Shift   delta
6             7       H          24.0772  5.9228
7             8       H          23.7986  6.2014
8             9       H          24.2922  5.7078
9            10       H          24.1632  5.8368
10           11       H          24.1572  5.8428
14           15       H          28.3212  1.6788
15           16       H          28.0110  1.9890
17           18       H          29.2324  0.7676
18           19       H          26.7298  3.2702,     Atom Number Nucleus  Isotropic Shift     delta
0             1       C          49.3721  130.6279
1             2       C          52.9650  127.0350
2             3       C          36.3443  143.6557
3             4       C          50.8163  129.1837
4             5       C          50.0493  129.9507
5             6       C          49.7985  130.2015
11           12       C         102.9401   77.0599
13           14       C         122.3188   57.6812)

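But this is definitely not optimal: it returns the data as a tuple of two separate dataframes and throws a SettingWithCopyWarning at me. Ideally, I would like to get back the original dataframe with an extra column of delta values. Thanks!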

2 answers:

Answer 0 (score: 2)

You can map the Nucleus column through a dict of reference values, then subtract the shifts with sub:

REF_H = 30
REF_C = 180
d = {'C': REF_C, 'H':REF_H}
df['Delta'] =  df.Nucleus.map(d).sub(df['Isotropic Shift'])
print (df)
    Atom Number Nucleus  Isotropic Shift     Delta
0             1       C          49.3721  130.6279
1             2       C          52.9650  127.0350
2             3       C          36.3443  143.6557
3             4       C          50.8163  129.1837
4             5       C          50.0493  129.9507
5             6       C          49.7985  130.2015
6             7       H          24.0772    5.9228
7             8       H          23.7986    6.2014
8             9       H          24.2922    5.7078
9            10       H          24.1632    5.8368
10           11       H          24.1572    5.8428
11           12       C         102.9401   77.0599
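
Since the question mentions that other nuclei may appear, here is a minimal self-contained sketch of the same map approach on a small sample. The sample rows, the extra 'N' row and the names refs and missing are assumptions made for illustration. A nucleus that is absent from the dict maps to NaN, so its Delta is NaN as well and can be detected afterwards.

```python
import pandas as pd

REF_H = 30
REF_C = 180
refs = {'C': REF_C, 'H': REF_H}  # reference value per nucleus

# Small illustrative sample; the 'N' row has no reference value on purpose.
df = pd.DataFrame({
    'Atom Number': [1, 7, 12, 13],
    'Nucleus': ['C', 'H', 'C', 'N'],
    'Isotropic Shift': [49.3721, 24.0772, 102.9401, 100.0],
})

# Look up the reference for each row, then subtract the shift from it.
df['Delta'] = df['Nucleus'].map(refs).sub(df['Isotropic Shift'])

# Nuclei without a reference end up as NaN in Delta.
missing = df.loc[df['Delta'].isna(), 'Nucleus'].unique()
print(df)
print('No reference value for:', list(missing))
```

Because the assignment is made on the original dataframe rather than on a filtered slice, it should not trigger a SettingWithCopyWarning.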

Answer 1 (score: 0)

# .ix has been removed from current pandas; .loc does the same label/boolean selection
df.loc[df.Nucleus == 'H', 'Reference Value'] = 30
df.loc[df.Nucleus == 'C', 'Reference Value'] = 180

df['delta'] = df['Reference Value'] - df['Isotropic Shift']

Atom Number     Nucleus    Isotropic Shift    Reference Value    delta
1               C          49.3721            180.0              130.6279 
2               C          52.9650            180.0              127.0350 
3               C          36.3443            180.0              143.6557 
4               C          50.8163            180.0              129.1837 
5               C          50.0493            180.0              129.9507 
6               C          49.7985            180.0              130.2015 
7               H          24.0772            30.0               5.9228 
8               H          23.7986            30.0               6.2014 
9               H          24.2922            30.0               5.7078 
10              H          24.1632            30.0               5.8368 
11              H          24.1572            30.0               5.8428 
12              C          102.9401           180.0              77.0599
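
If there are many nuclei, writing one assignment per element gets tedious. One possible variant, sketched here under the assumption that the reference values can be kept in their own small dataframe, is to merge that table onto the data. The names refs and out and the sample rows are hypothetical.

```python
import pandas as pd

# Small illustrative sample of the question's data.
df = pd.DataFrame({
    'Atom Number': [1, 7, 12],
    'Nucleus': ['C', 'H', 'C'],
    'Isotropic Shift': [49.3721, 24.0772, 102.9401],
})

# One row per nucleus with its reference value.
refs = pd.DataFrame({
    'Nucleus': ['H', 'C'],
    'Reference Value': [30, 180],
})

# A left merge keeps every row of df in its original order
# and adds the matching 'Reference Value' column.
out = df.merge(refs, on='Nucleus', how='left')
out['delta'] = out['Reference Value'] - out['Isotropic Shift']
print(out)
```

Note that merge returns a new dataframe with a fresh integer index, so the original index labels are not preserved.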