My df looks like this:
Index Country Val1 Val2 ... Val10
1 Australia 1 3 ... 5
2 Bambua 12 33 ... 56
3 Tambua 14 34 ... 58
For each country I want the difference between Val10 and Val1 (Val10 - Val1), so that the output looks like this:
Country Val10-Val1
Australia 4
Bambua 44
Tambua 44
So far I have:
def myDelta(row):
    data = row[['Val10', 'Val1']]
    return pd.Series({'Delta': np.subtract(data)})

def runDeltas():
    myDF = getDF() \
        .apply(myDelta, axis=1) \
        .sort_values(by=['Delta'], ascending=False)
    return myDF
Calling runDeltas results in this error:
ValueError: ('invalid number of arguments', u'occurred at index 9')
What is the proper way to fix this?
Answer 0 (score: 5)
Given the following dataframe:
df = pd.DataFrame([["Australia", 1, 3, 5],
["Bambua", 12, 33, 56],
["Tambua", 14, 34, 58]
], columns=["Country", "Val1", "Val2", "Val10"]
)
it boils down to a simple broadcasting operation (this one computes Val1 - Val10; swap the operands to get Val10 - Val1):
>>> val1_minus_val10 = df["Val1"] - df["Val10"]
>>> print(val1_minus_val10)
0 -4
1 -44
2 -44
dtype: int64
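As for the error in the question: np.subtract is a binary ufunc, so it expects the two operands as separate arguments, while myDelta hands it a single two-element Series. Below is a minimal sketch of how the original apply-based approach could be repaired. The name my_delta and the inline df standing in for getDF() are assumptions made for illustration, and the exact exception type and message may differ between NumPy versions.

```python
import numpy as np
import pandas as pd

# Stand-in for the asker's getDF(), which is not shown in the question.
df = pd.DataFrame(
    [["Australia", 1, 3, 5], ["Bambua", 12, 33, 56], ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)

def my_delta(row):
    # Pass both operands explicitly instead of one two-element Series.
    return pd.Series({"Delta": np.subtract(row["Val10"], row["Val1"])})

# Apply row by row, then sort the differences from largest to smallest.
deltas = df.apply(my_delta, axis=1).sort_values(by="Delta", ascending=False)
print(deltas)
```

The plain column subtraction shown above gives the same numbers without a Python-level function call per row.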
Answer 1 (score: 3)
Taking this as the df:
df = pd.DataFrame([["Australia", 1, 3, 5],
["Bambua", 12, 33, 56],
["Tambua", 14, 34, 58]
], columns=["Country", "Val1", "Val2", "Val10"]
)
You can also do the subtraction and store the result in a new column, as shown below.
>>> df['Val_Diff'] = df['Val10'] - df['Val1']
>>> df
Country Val1 Val2 Val10 Val_Diff
0 Australia 1 3 5 4
1 Bambua 12 33 56 44
2 Tambua 14 34 58 44
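To get the two-column layout the question asks for (Country next to the difference, largest first), the new column can be combined with a column selection and a sort. This is only a sketch; the variable name result is an assumption, and the column label Val10-Val1 is taken from the question's desired output.

```python
import pandas as pd

df = pd.DataFrame(
    [["Australia", 1, 3, 5], ["Bambua", 12, 33, 56], ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)

# Keep only the Country column, then attach the difference next to it.
result = df[["Country"]].copy()
result["Val10-Val1"] = df["Val10"] - df["Val1"]

# Largest difference first, as in the question's runDeltas.
result = result.sort_values(by="Val10-Val1", ascending=False)
print(result)
```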
Answer 2 (score: 1)
You can also use the pandas.DataFrame.assign function, e.g.:
import numpy as np
import pandas as pd
df = pd.DataFrame([["Australia", 1, 3, 5],
["Bambua", 12, 33, 56],
["Tambua", 14, 34, 58]
], columns=["Country", "Val1", "Val2", "Val10"]
)
df = df.assign(Val10_minus_Val1 = df['Val10'] - df['Val1'])
The best part of assign is that you can add as many assignments as you need. For example, compute the difference and then take its log:
df = df.assign(Val10_minus_Val1 = df['Val10'] - df['Val1'], log_result = lambda x: np.log(x.Val10_minus_Val1) )
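Referring to a column created earlier in the same assign call relies on keyword order being preserved, which, as far as I know, requires Python 3.6+ and pandas 0.23 or later. A sketch of a more portable variant chains two assign calls, so each step only sees columns that already exist. The variable name out is an assumption for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [["Australia", 1, 3, 5], ["Bambua", 12, 33, 56], ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)

out = (
    # Step 1: add the difference column.
    df.assign(Val10_minus_Val1=lambda d: d["Val10"] - d["Val1"])
    # Step 2: the lambda receives the frame that already has the new column.
    .assign(log_result=lambda d: np.log(d["Val10_minus_Val1"]))
)
print(out)
```

Because assign returns a new DataFrame, df itself is left unchanged here.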
Answer 3 (score: 0)
I ran into this today, which made me want to share it with you. As mentioned above, you can simply use:
df['Val10-Val1'] = df['Val10']-df['Val1']
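But sometimes you may need to use the apply function, in which case you can use the following line (axis=1 is needed so that the lambda receives rows rather than columns):
df['Val10-Val1'] = df.apply(lambda row: row['Val10']-row['Val1'], axis=1)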
Answer 4 (score: 0)
You can use a lambda function and assign the result to a new column.
df['Val10-Val1'] = df.apply(lambda x: x['Val10'] - x['Val1'], axis=1)
print(df)
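The axis=1 argument matters here: by default apply passes each column to the function, so looking up 'Val10' inside the lambda fails. The sketch below illustrates that, and checks that the row-wise apply and the plain column subtraction agree. The variable names are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [["Australia", 1, 3, 5], ["Bambua", 12, 33, 56], ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)

# Without axis=1 the lambda receives whole columns, indexed 0..2,
# so the label 'Val10' does not exist and a KeyError is raised.
try:
    df.apply(lambda col: col["Val10"] - col["Val1"])
    failed_without_axis = False
except KeyError:
    failed_without_axis = True

# With axis=1 the lambda receives one row at a time.
via_apply = df.apply(lambda row: row["Val10"] - row["Val1"], axis=1)

# Column arithmetic gives the same values without a per-row function call.
vectorized = df["Val10"] - df["Val1"]
same_values = bool((via_apply == vectorized).all())
print(failed_without_axis, same_values)
```

For simple arithmetic like this, the column subtraction is usually the better choice; apply is worth reaching for when the per-row logic cannot be expressed with column operations.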