Divide certain columns by another column in pandas

Time: 2017-08-26 20:56:52

Tags: python pandas dataframe

I am wondering whether there is a more efficient way to divide multiple columns by a certain column. For example, say I have:

prev    open    close   volume
20.77   20.87   19.87   962816
19.87   19.89   19.56   668076
19.56   19.96   20.1    578987
20.1    20.4    20.53   418597

I would like to get:

prev    open    close   volume
20.77   1.0048  0.9567  962816
19.87   1.0010  0.9844  668076
19.56   1.0204  1.0276  578987
20.1    1.0149  1.0214  418597

Basically, the columns 'open' and 'close' have been divided by the value in column 'prev'.

I was able to do this with:
df['open'] = list(map(lambda x,y: x/y, df['open'],df['prev']))
df['close'] = list(map(lambda x,y: x/y, df['close'],df['prev']))

I was wondering whether there is a simpler way, especially if there are 10 columns to be divided by the same value?

3 answers:

Answer 0: (score: 5)

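For performance, I would suggest using the underlying array data together with array slicing. The two columns to be modified are adjacent, so slicing them yields a view rather than a copy: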

a = df.values
df.iloc[:,1:3] = a[:,1:3]/a[:,0,None]

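To elaborate on the array-slicing part: a[:,[1,2]] would force a copy there and slow things down. On the dataframe side, a[:,[1,2]] is equivalent to df[['open','close']], which I am guessing slows things down as well. Using df.iloc[:,1:3] therefore improves on it.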
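The view-versus-copy distinction above can be checked directly with NumPy. This is a minimal sketch, assuming a small hypothetical array `a` standing in for `df.values`; it also shows why `a[:,0,None]` broadcasts across the sliced columns.

```python
import numpy as np

# Hypothetical small array standing in for df.values (illustrative data only)
a = np.array([[20.77, 20.87, 19.87, 962816.0],
              [19.87, 19.89, 19.56, 668076.0]])

sliced = a[:, 1:3]     # basic slicing: a view onto a's memory
fancy = a[:, [1, 2]]   # integer-list indexing: a new copy

print(np.shares_memory(a, sliced))  # True
print(np.shares_memory(a, fancy))   # False

# a[:, 0, None] has shape (n, 1), so it broadcasts across both sliced columns
divisor = a[:, 0, None]
ratios = sliced / divisor
print(divisor.shape, ratios.shape)
```

The division itself always allocates a new result array; the saving comes from not copying the inputs first.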

Sample run:

In [64]: df
Out[64]: 
    prev   open  close  volume
0  20.77  20.87  19.87  962816
1  19.87  19.89  19.56  668076
2  19.56  19.96  20.10  578987
3  20.10  20.40  20.53  418597

In [65]: a = df.values
    ...: df.iloc[:,1:3] = a[:,1:3]/a[:,0,None]
    ...: 

In [66]: df
Out[66]: 
    prev      open     close  volume
0  20.77  1.004815  0.956668  962816
1  19.87  1.001007  0.984399  668076
2  19.56  1.020450  1.027607  578987
3  20.10  1.014925  1.021393  418597

Runtime test

Approaches:

def numpy_app(df): # Proposed in this post
    a = df.values
    df.iloc[:,1:3] = a[:,1:3]/a[:,0,None]
    return df

def pandas_app1(df): # @Scott Boston's soln
    df[['open','close']] = df[['open','close']].div(df['prev'].values,axis=0)
    return df

Timings:

In [44]: data = np.random.randint(15, 25, (100000,4)).astype(float)
    ...: df1 = pd.DataFrame(data, columns=(('prev','open','close','volume')))
    ...: df2 = df1.copy()
    ...: 

In [45]: %timeit pandas_app1(df1)
    ...: %timeit numpy_app(df2)
    ...: 
100 loops, best of 3: 2.68 ms per loop
1000 loops, best of 3: 885 µs per loop

Answer 1: (score: 4)

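columns_to_divide = ['open', 'close']
# axis=0 aligns df['prev'] with the rows; a plain `/` would align it with the column labels
df[columns_to_divide] = df[columns_to_divide].div(df['prev'], axis=0)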
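How pandas aligns the operands matters here. When a DataFrame is divided by a Series with the plain `/` operator, the Series index is matched against the DataFrame's column labels, whereas `.div(..., axis=0)` matches it against the row index. A minimal sketch using the question's data (the names `misaligned` and `aligned` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'prev':   [20.77, 19.87, 19.56, 20.10],
    'open':   [20.87, 19.89, 19.96, 20.40],
    'close':  [19.87, 19.56, 20.10, 20.53],
    'volume': [962816, 668076, 578987, 418597],
})
cols = ['open', 'close']

# Plain `/`: the Series index (0..3) is matched against the column labels,
# none of which coincide, so every cell of the result is NaN
misaligned = df[cols] / df['prev']

# .div(..., axis=0): the Series index is matched against the row index
aligned = df[cols].div(df['prev'], axis=0)

print(misaligned.isna().all().all())
print(aligned.round(6))
```

Passing `df['prev'].values` instead of the Series, as other answers in this thread do, sidesteps index alignment altogether.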

Answer 2: (score: 2)

df2[['open','close']] = df2[['open','close']].div(df2['prev'].values,axis=0)

Output:

    prev      open     close  volume
0  20.77  1.004815  0.956668  962816
1  19.87  1.001007  0.984399  668076
2  19.56  1.020450  1.027607  578987
3  20.10  1.014925  1.021393  418597
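For the question's case of ten or so columns, the same `.div` call can be applied to a computed column list instead of a hand-written one. A minimal sketch, assuming (hypothetically) that every column except `prev` and `volume` should be divided:

```python
import pandas as pd

df = pd.DataFrame({
    'prev':   [20.77, 19.87, 19.56, 20.10],
    'open':   [20.87, 19.89, 19.96, 20.40],
    'close':  [19.87, 19.56, 20.10, 20.53],
    'volume': [962816, 668076, 578987, 418597],
})

# Assumed rule for illustration: divide everything except 'prev' and 'volume'
cols = df.columns.difference(['prev', 'volume'])

# Row-wise division of all selected columns by 'prev'
df[cols] = df[cols].div(df['prev'], axis=0)

print(df.round(6))
```

`Index.difference` returns the remaining labels in sorted order, which is harmless here because the same `cols` is used on both sides of the assignment.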