我需要在df上应用一个函数来创建多个新列。作为我的函数的输入,我需要(i)一行或(ii)多列
def divideAndMultiply(x,y):
return x/y, x*y
df["e"], df["f"] = zip(*df.a.apply(lambda val: divideAndMultiply(val,2)))
(https://stackoverflow.com/a/36600318/1831695)
这可以创建多个新列,但不能使用多个输入(=来自该df的列)。或者我错过了什么?
示例我要做的事情:
我的DF有几列。其中3(a,b,c)与计算两个新列(y,z)相关,其中y = a + b + c
和z = c - b - a
我知道这是一个简单的计算,我不需要函数,但我们假设我们需要一个函数。 而不是编写和应用2个函数,我想只有一个函数,返回两个值并接受所有三个值(甚至更好:一行)进行计算。
这个例子:
df["y"], df["z"] = zip(*df.a.apply(lambda val: divideAndMultiply(val,2)))
仅在使用一个列数据项(val)和其他值(在本例中为2)时有效。
我需要这样的东西:
df["y"], df["z"] = zip(*df.a.apply(lambda val: divideAndMultiply(df['a'],df['b'],df['c'])))
(是的,我知道val已分配给df.a
)
更新2 这就是我的df的样子:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4655 entries, 0 to 4654
Data columns (total 71 columns):
Open 4655 non-null float64
Close 4655 non-null float64
High 4655 non-null float64
Low 4655 non-null float64
DateTime 4655 non-null datetime64[ns]
Date 4655 non-null datetime64[ns]
T_CLOSE 4655 non-null float64
T_HIGH 4655 non-null float64
T_LOW 4655 non-null float64
T_OPEN 4655 non-null float64
MVA5 4651 non-null float64
MVA10 4646 non-null float64
MVA14 4642 non-null float64
MVA15 4641 non-null float64
MVA28 4628 non-null float64
MVA50 4606 non-null float64
STD5 4651 non-null float64
STD10 4646 non-null float64
STD15 4641 non-null float64
CV_5 4651 non-null float64
CV_10 4646 non-null float64
CV_15 4641 non-null float64
DIFF_VP_CLOSE 4654 non-null float64
DIFF_VP_HIGH 4654 non-null float64
DIFF_VP_OPEN 4654 non-null float64
DIFF_VP_LOW 4654 non-null float64
AVG_STEIG_5 4650 non-null float64
AVG_STEIG_10 4645 non-null float64
AVG_STEIG_15 4640 non-null float64
AVG_STEIG_28 4627 non-null float64
AVG_5_DIFF 4651 non-null float64
AVG_10_DIFF 4646 non-null float64
AVG_15_DIFF 4641 non-null float64
AVG_14_DIFF 4642 non-null float64
AVG_50_DIFF 4606 non-null float64
AD_5_14 4642 non-null float64
Momentum_4 4651 non-null float64
ROC_4 4652 non-null float64
Momentum_8 4647 non-null float64
ROC_8 4648 non-null float64
Momentum_12 4643 non-null float64
ROC_12 4644 non-null float64
VT_OPEN 4598 non-null float64
VT_CLOSE 4598 non-null float64
VT_HIGH 4598 non-null float64
VT_LOW 4598 non-null float64
PP_VT 4598 non-null float64
R1_VT 4598 non-null float64
R2_VT 4598 non-null float64
R3_VT 4598 non-null float64
S1_VT 4598 non-null float64
S2_VT 4598 non-null float64
S3_VT 4598 non-null float64
DIFF_VT_CLOSE 4598 non-null float64
DIFF_VT_HIGH 4598 non-null float64
DIFF_VT_OPEN 4598 non-null float64
DIFF_VT_LOW 4598 non-null float64
DIFF_T_OPEN 4655 non-null float64
DIFF_T_LOW 4655 non-null float64
DIFF_T_HIGH 4655 non-null float64
DIFF_T_CLOSE 4655 non-null float64
DIFF_VTCLOSE_TOPEN 4598 non-null float64
VP_HIGH 4654 non-null float64
VP_LOW 4654 non-null float64
VP_OPEN 4654 non-null float64
VP_CLOSE 4654 non-null float64
regel_r1 4655 non-null int64
regel_r2 4655 non-null int64
regel_r3 4655 non-null int64
regeln 4655 non-null int64
vormittag_flag 4655 non-null int64
dtypes: datetime64[ns](2), float64(64), int64(5)
memory usage: 2.6 MB
None
Open Close High Low DateTime Date T_CLOSE T_HIGH T_LOW T_OPEN MVA5 MVA10 MVA14 MVA15 MVA28 MVA50 STD5 STD10 STD15 CV_5 CV_10 CV_15 DIFF_VP_CLOSE DIFF_VP_HIGH DIFF_VP_OPEN DIFF_VP_LOW AVG_STEIG_5 AVG_STEIG_10 AVG_STEIG_15 AVG_STEIG_28 AVG_5_DIFF AVG_10_DIFF AVG_15_DIFF AVG_14_DIFF AVG_50_DIFF AD_5_14 Momentum_4 ROC_4 Momentum_8 ROC_8 Momentum_12 ROC_12 VT_OPEN VT_CLOSE VT_HIGH VT_LOW PP_VT R1_VT R2_VT R3_VT S1_VT S2_VT S3_VT DIFF_VT_CLOSE DIFF_VT_HIGH DIFF_VT_OPEN DIFF_VT_LOW DIFF_T_OPEN DIFF_T_LOW DIFF_T_HIGH DIFF_T_CLOSE DIFF_VTCLOSE_TOPEN VP_HIGH VP_LOW VP_OPEN VP_CLOSE regel_r1 regel_r2 regel_r3 regeln vormittag_flag T_DIRC T_WECHSELC T_NUM_INNEN T_CANDLEC
4653 12488.1 12490.1 12490.6 12484.9 2017-05-03 14:00:00 2017-05-03 12490.1 12508.3 12475.4 12506.5 12490.18 12488.41 12487.521429 12487.053333 12493.078571 12498.118 2.178761 4.334218 4.515033 0.000174 0.000347 0.000362 -1.7 2.5 -1.0 -2.7 -9.6 8.5 4.866667 -8.178571 -0.08 1.69 3.046667 2.578571 -8.018 -2.658571 -3.8 0.00004 8.7 0.000577 -0.3 0.000521 12449.3 12514.3 12527.8 12432.0 12491.366667 12550.733333 12587.166667 12646.533333 12454.933333 12395.566667 12359.133333 24.2 37.7 -40.8 -58.1 16.4 -14.7 18.2 0.0 7.8 12492.6 12487.4 12489.1 12488.4 0 0 0 0 0 neutral INNEN 1 GRUEN
4654 12489.9 12489.9 12489.9 12489.6 2017-05-03 14:15:00 2017-05-03 12489.9 12508.3 12475.4 12506.5 12489.38 12488.91 12487.828571 12487.680000 12492.182143 12498.436 0.712039 4.169586 4.180431 0.000057 0.000334 0.000335 0.2 0.7 -1.8 -5.0 -8.0 5.0 6.266667 -8.964286 0.52 0.99 2.220000 2.071429 -8.536 -1.551429 0.3 0.00008 7.0 0.000064 6.3 0.000665 12449.3 12514.3 12527.8 12432.0 12491.366667 12550.733333 12587.166667 12646.533333 12454.933333 12395.566667 12359.133333 24.4 37.9 -40.6 -57.9 16.6 -14.5 18.4 0.0 7.8 12490.6 12484.9 12488.1 12490.1 0 0 0 0 0 neutral INNEN 1 GRUEN
答案 0 :(得分:2)
这是一个带有Quandl数据集
的简单示例import quandl
df = quandl.get("WIKI/GOOGL")
columns = ["High", 'Low', 'Close']
def operations(row, columns):
df1 = row[columns[0]] + row[columns[1]] + row[columns[2]]
df2 = -row[columns[1]] - row[columns[2]] + row[columns[0]]
return df1, df2
df["function1"], df["function2"] = zip(*df.apply(lambda row: operations(row, columns), axis=1))
df[["High","Low","Close","function1", "function2"]].head(5)