Subtract a Series from a DataFrame

Time: 2017-06-09 02:18:03

Tags: python pandas

I have a pandas.DataFrame that contains several columns of data.

Example DataFrame:

Sample_ID, NaX, NaU
1,         1.0, 2.3
2,         3.4, 2.0

Series:

Sample_ID: Blank
NaX        0.2
NaU        0.1

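Is there a way to subtract the values in the Series from the DataFrame?

The end result I am looking for is the example DataFrame with the blank subtracted: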

Sample_ID, NaX, NaU
1,         0.8, 2.2
2,         3.2, 1.9

Thanks.

Larger Series:

Sample_ID    572
NaX          3073.333333
NaU          126.666667
MgX          3081.666667
MgU          69.333333
AlX          5275.333333
AlU          48.333333
SiX          554966.6667
SiU          366.666667
PX           294.866667
PU           3.333333
SX           0
SU           0
ClX          153.033333
ClU          1.266667
ArX          NaN
ArU          NaN
KX           684.666667
KU           13.666667
CaX          6771.333333
CaU          33.666667
ScX          43
ScU          12
TiX          75.433333
TiU          4.166667
VX           12.533333
VU           3.633333
CrX          74.6
CrU          3.033333
MnX          35.6
...
AgX          0
AgU          0
CdX          0
CdU          0
SnX          0
SnU          0
SbX          0
SbU          0
TeX          0
TeU          0
IX           0
IU           0
CsX          0
CsU          0
BaX          0
BaU          0
LaX          0
LaU          0
CeX          0
CeU          0
SmX          0
SmU          0
WX           0
WU           0
HgX          0
HgU          0
PbX          0
PbU          0
BiX          0
BiU          0
Length: 87, dtype: float64

Larger DataFrame:

Sample_ID,NaX,NaU,MgX,MgU,AlX,AlU,SiX,SiU,PX,PU,...,SmX,SmU,WX,WU,HgX,HgU,PbX,PbU,BiX,BiU
6,332,9470,230,2680,110,6257,55,372700,300,1836,...,0,0,0,0,0,0,297,10,0,0
7,332_Repeat,8940,230,2690,110,6199,55,383500,300,1754,...,0,0,0,0,0,0,215,11,0,0
8,346,10470,260,2500,120,7004,56,253300,200,2586,...,0,0,0,0,0,0,676,13,0,0
9,347,2740,160,1530,79,4799,51,521200,300,530.8,...,0,0,0,0,0,0,107.3,8.8,0,0
10,348,5260,190,1749,91,5506,53,448400,300,1143,...,0,0,0,0,0,0,211,10,0,0
11,348_Repeat,5510,190,1795,91,5486,53,447600,300,1138,...,0,0,0,0,0,0,174,10,0,0
17,427,0,0,3484,75,5093,48,529000,300,560,...,0,0,0,0,0,0,0,0,0,0
18,427_Repeat,0,0,3598,76,5096,48,529900,300,557.8,...,0,0,0,0,0,0,0,0,0,0
19,428,3410,140,5602,86,7590,56,562600,300,794.2,...,0,0,0,0,0,0,8.3,7.3,0,0
20,429,0,0,3977,78,5107,49,530300,300,594.6,...,0,0,0,0,0,0,0,0,0,0
21,430,3530,140,5626,88,7944,57,559800,300,899,...,0,0,0,0,0,0,3.8,3.8,0,0
22,447,139200,300,0,0,2473,27,135200,100,432.2,...,0,0,0,0,0,0,0,0,0,0
23,447_Repeat,138900,300,0,0,2504,26,135400,100,440.3,...,0,0,0,0,0,0,0,0,0,0
24,448,141900,400,0,0,1829,26,73970,60,419.7,...,0,0,0,0,0,0,0,0,0,0
25,449,169700,400,0,0,2034,26,40150,40,420.5,...,0,0,0,0,0,0,0,0,0,0
26,567,168400,600,9460,200,1894,52,20560,40,1474,...,0,0,0,0,0,0,1.2,1.2,0,0
27,568,169300,600,3230,190,1455,51,11370,30,1414,...,0,0,0,0,0,0,6.9,6.9,0,0
28,568_Repeat,169400,600,3200,190,1462,51,11340,30,1406,...,0,0,0,0,0,0,5.3,5.3,0,0
35,7320,174700,500,3470,110,8720,48,129100,100,452.7,...,0,0,0,0,0,0,19,8.2,0,0
36,7323,176500,500,0,0,4928,51,71390,80,572.8,...,0,0,0,0,0,0,17.2,8.1,0,0
37,7326,56390,220,26440,110,24600,60,320900,200,242.1,...,0,0,0,0,0,0,22.3,7.8,0,0

The Series and the DataFrame were also attached as images (not reproduced here).

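2 answers:

Answer 0 (score: 1)

If df is the DataFrame and s is the Series, then df - s does this. pandas broadcasts much as NumPy does: the Series index is matched against the DataFrame's column labels, and the subtraction is applied to every row.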
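A minimal sketch of that broadcasting, using the small example from the question. It assumes Sample_ID is the row index and that the Series is labelled with the same names as the DataFrame columns; the variable names corrected and corrected_explicit are only illustrative.

```python
import pandas as pd

# Example values from the question; Sample_ID is used as the row index
df = pd.DataFrame(
    {"NaX": [1.0, 3.4], "NaU": [2.3, 2.0]},
    index=pd.Index([1, 2], name="Sample_ID"),
)

# Blank values, labelled with the same names as the DataFrame columns
s = pd.Series({"NaX": 0.2, "NaU": 0.1}, name="Blank")

# The Series index is matched against the column labels, then the
# subtraction is applied to every row
corrected = df - s

# The same operation with the axis spelled out
corrected_explicit = df.sub(s, axis="columns")

print(corrected.round(1))
```

Because the match is by label rather than by position, the order of the entries in s does not have to follow the column order of df.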

Answer 1 (score: 1)

Test code:

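import pandas as pd

# DataFrame of sample measurements, indexed by Sample_ID
df = pd.DataFrame([[1, 1.0, 2.3], [2, 3.4, 2.0]],
                  columns=["Sample_ID", "NaX", "NaU"])\
    .set_index('Sample_ID')

# Series of blank values, labelled with the DataFrame's column names
s = pd.DataFrame([['NaX', 0.2], ['NaU', 0.1]],
                   columns=['Sample_ID', 'Blank'])\
    .set_index('Sample_ID').Blank

print(df)
print(s)
print(df-s)  # the Series index is aligned with the DataFrame columns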

Result:

           NaX  NaU
Sample_ID          
1          1.0  2.3
2          3.4  2.0

Sample_ID
NaX    0.2
NaU    0.1
Name: Blank, dtype: float64

           NaX  NaU
Sample_ID          
1          0.8  2.2
2          3.2  1.9
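
The larger data in the question differs from this test case in two ways: the Series carries its own Sample_ID entry along with NaN values (ArX, ArU), and Sample_ID appears to be an ordinary column of the DataFrame. The following is a hedged sketch of one way to handle that, not a method given in either answer. It uses a small illustrative subset of the columns; the ArX sample values are made up, and treating a missing blank as 0 is an assumption rather than something stated in the question.

```python
import numpy as np
import pandas as pd

# Illustrative subset of the larger data; the ArX sample values are made up
df = pd.DataFrame({
    "Sample_ID": ["332", "332_Repeat"],
    "NaX": [9470.0, 8940.0],
    "NaU": [230.0, 230.0],
    "ArX": [5.0, 7.0],
})

# Blank Series, including its own Sample_ID entry and a NaN value
blank = pd.Series({
    "Sample_ID": 572,
    "NaX": 3073.333333,
    "NaU": 126.666667,
    "ArX": np.nan,
})

# Move the identifier into the index so only numeric columns are subtracted
samples = df.set_index("Sample_ID")

# Drop the blank's own Sample_ID entry; assume a missing blank means
# "no correction" and replace NaN with 0
offsets = blank.drop("Sample_ID").fillna(0)

corrected = samples.sub(offsets, axis="columns")
print(corrected)
```

Without the fillna(0) step, every value in the ArX column of the result would be NaN, because any arithmetic with NaN yields NaN.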