Subtracting two columns in a DataFrame

Time: 2018-01-19 23:12:38

Tags: python pandas

My df looks like this:

Index    Country    Val1  Val2 ... Val10
1        Australia  1     3    ... 5
2        Bambua     12    33   ... 56
3        Tambua     14    34   ... 58

For each country I want to subtract Val1 from Val10, so that the output looks like this:

Country    Val10-Val1
Australia  4
Bambua     44
Tambua     44

So far I have:

def myDelta(row):
    data = row[['Val10', 'Val1']]
    return pd.Series({'Delta': np.subtract(data)})

def runDeltas():
    myDF = getDF() \
        .apply(myDelta, axis=1) \
        .sort_values(by=['Delta'], ascending=False)
    return myDF

Calling runDeltas raises this error:

ValueError: ('invalid number of arguments', u'occurred at index 9')

What is the correct way to solve this?

5 answers:

Answer 0 (score: 5)

Given the following dataframe:

df = pd.DataFrame([["Australia", 1, 3, 5],
                   ["Bambua", 12, 33, 56],
                   ["Tambua", 14, 34, 58]
                  ], columns=["Country", "Val1", "Val2", "Val10"]
                 )

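This boils down to a simple element-wise (broadcasting) operation. Note that the snippet computes Val1 - Val10; swap the operands to get the Val10 - Val1 that the question asks for: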

>>> val1_minus_val10 = df["Val1"] - df["Val10"]
>>> print(val1_minus_val10)
0    -4
1   -44
2   -44
dtype: int64
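
As for the error in the question: np.subtract takes two operands, whereas myDelta passes it a single two-element Series, which is most likely what triggers the "invalid number of arguments" message. A minimal sketch of the asker's goal (Country plus the sorted difference) using plain column arithmetic follows. The column name Delta is taken from the question; the sample frame is the one used in this answer, not the asker's real data.

```python
import pandas as pd

df = pd.DataFrame(
    [["Australia", 1, 3, 5],
     ["Bambua", 12, 33, 56],
     ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)

# Subtract whole columns at once; no row-wise apply is needed.
result = (
    df[["Country"]]
    .assign(Delta=df["Val10"] - df["Val1"])
    .sort_values(by="Delta", ascending=False)  # largest difference first
    .reset_index(drop=True)
)
print(result)
```

Keeping only the Country column before assign gives the two-column layout shown in the question.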

Answer 1 (score: 3)

Taking this as df:

df = pd.DataFrame([["Australia", 1, 3, 5],
                   ["Bambua", 12, 33, 56],
                   ["Tambua", 14, 34, 58]
                  ], columns=["Country", "Val1", "Val2", "Val10"]
                 )

You can also do the subtraction and store the result in a new column, as shown below.

>>> df['Val_Diff'] = df['Val10'] - df['Val1']
>>> df
    Country     Val1    Val2  Val10 Val_Diff
0   Australia   1       3      5    4
1   Bambua      12      33     56   44
2   Tambua      14      34     58   44

Answer 2 (score: 1)

You can also use the pandas.DataFrame.assign function, e.g.:

import numpy as np
import pandas as pd

df = pd.DataFrame([["Australia", 1, 3, 5],
                   ["Bambua", 12, 33, 56],
                   ["Tambua", 14, 34, 58]
                  ], columns=["Country", "Val1", "Val2", "Val10"]
                 )

df = df.assign(Val10_minus_Val1 = df['Val10'] - df['Val1'])

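The best part of assign is that you can add as many assignments as you need. For example, compute the difference and then take its logarithm: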

df = df.assign(Val10_minus_Val1 = df['Val10'] - df['Val1'], log_result = lambda x: np.log(x.Val10_minus_Val1) )
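
A self-contained sketch of the chained assign is given here for reference. It assumes a pandas version (0.23 or later, on Python 3.6+) in which a later keyword argument of assign may refer to a column created earlier in the same call; on older versions the two assignments would need separate assign calls.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [["Australia", 1, 3, 5],
     ["Bambua", 12, 33, 56],
     ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)

out = df.assign(
    # First new column: plain column arithmetic.
    Val10_minus_Val1=df["Val10"] - df["Val1"],
    # Second new column: the lambda receives the frame that already
    # contains Val10_minus_Val1, so it can build on it.
    log_result=lambda x: np.log(x["Val10_minus_Val1"]),
)
print(out)
```

assign returns a new DataFrame and leaves df itself unchanged, which is why the answer rebinds the result to df.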


Answer 3 (score: 0)

I ran into this today and wanted to share what I found. As mentioned above, you can simply use:

df['Val10-Val1'] = df['Val10']-df['Val1']

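Sometimes, however, you may need to use the apply function. In that case pass axis=1 so that the lambda receives one row at a time: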

df['Val10-Val1'] = df.apply(lambda row: row['Val10'] - row['Val1'], axis=1)

Answer 4 (score: 0)

You can use a lambda function and assign its result to a new column.

df['Val10-Val1'] = df.apply(lambda x: x['Val10'] - x['Val1'], axis=1)
print(df)
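
As a quick check, the row-wise apply and the plain column subtraction can be compared on the sample data used throughout the answers. This is only a sketch; the variable names by_apply and vectorized are illustrative. With the default axis=0, apply would hand the lambda one column at a time, and looking up 'Val10' in a column raises a KeyError, so axis=1 is required here.

```python
import pandas as pd

df = pd.DataFrame(
    [["Australia", 1, 3, 5],
     ["Bambua", 12, 33, 56],
     ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)

# Row-wise: the lambda is called once per row (axis=1).
by_apply = df.apply(lambda row: row["Val10"] - row["Val1"], axis=1)

# Column-wise: a single vectorized subtraction.
vectorized = df["Val10"] - df["Val1"]

# Both approaches yield the same values.
print(by_apply.tolist() == vectorized.tolist())
```

The vectorized form is usually the better default, since it avoids a Python-level function call per row; apply is worth reaching for only when the per-row logic cannot be expressed with column operations.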