How can I efficiently subtract each row from a pandas DataFrame?

Asked: 2017-11-21 12:48:34

Tags: python pandas dataframe

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)),
                  columns=list('ABCD'))

dfs = []

# Subtract each row in turn from the whole DataFrame
for index in range(len(df)):
    subtracted = df - df.loc[index]
    dfs.append(subtracted)

Is there a more efficient way to do this, perhaps with apply? Doing it as above is quite slow for large DataFrames...

1 Answer:

Answer 0: (score: 4)

IIUC:

Sample DF:

In [124]: df = pd.DataFrame(np.arange(9).reshape(3,3), columns=list('abc'))

In [125]: df
Out[125]:
   a  b  c
0  0  1  2
1  3  4  5
2  6  7  8

Getting dfs (all the differences at once, as a single 3-D array):

In [126]: (df.values - df.values[:, None])
Out[126]:
array([[[ 0,  0,  0],
        [ 3,  3,  3],
        [ 6,  6,  6]],

       [[-3, -3, -3],
        [ 0,  0,  0],
        [ 3,  3,  3]],

       [[-6, -6, -6],
        [-3, -3, -3],
        [ 0,  0,  0]]])
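
The array above stacks one 2-D result per subtracted row. If a list of DataFrames like the `dfs` in the question is needed, each slice can be wrapped back with the original labels. The following is a minimal sketch under that assumption; the names `diffs`, `expected`, and `matches` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3), columns=list('abc'))

# 3-D array: diffs[i] holds df minus row i
diffs = df.values - df.values[:, None]

# Wrap each 2-D slice back into a DataFrame with the original labels
dfs = [pd.DataFrame(d, index=df.index, columns=df.columns) for d in diffs]

# Reference result from an explicit loop, as in the question
expected = [df - df.iloc[i] for i in range(len(df))]

# True when every broadcast slice equals the corresponding loop result
matches = all(a.equals(b) for a, b in zip(dfs, expected))
```

Wrapping the slices adds some overhead, so when only the numbers matter it is probably simpler to keep working with the 3-D array directly.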

Getting subtracted (the last element, i.e. the DataFrame minus its last row):

In [127]: (df.values - df.values[:, None])[-1]
Out[127]:
array([[-6, -6, -6],
       [-3, -3, -3],
       [ 0,  0,  0]])

Some explanation:

df.values[:, None] is a synonym for df.values[:, np.newaxis]:

In [132]: df.values[:, np.newaxis]
Out[132]:
array([[[0, 1, 2]],

       [[3, 4, 5]],

       [[6, 7, 8]]])
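
The inserted axis is what makes the subtraction work: NumPy broadcasts the (3, 3) array against the (3, 1, 3) array, giving a (3, 3, 3) result in which slice i is the whole array minus row i. The sketch below checks this on the sample data using plain NumPy; the variable names are illustrative. It also shows how the number of elements grows with the square of the row count, which is worth keeping in mind for large DataFrames.

```python
import numpy as np

values = np.arange(9).reshape(3, 3)      # same data as the sample DF
expanded = values[:, None]               # shape (3, 1, 3)

# Broadcasting (3, 3) against (3, 1, 3) yields shape (3, 3, 3)
result = values - expanded

# Slice i equals the whole array minus row i
same_as_loop = all(
    np.array_equal(result[i], values - values[i]) for i in range(len(values))
)

# The result holds n_rows * n_rows * n_cols elements,
# e.g. for the 100 x 4 frame from the question:
n_rows, n_cols = 100, 4
n_elements = n_rows * n_rows * n_cols    # 40000
```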