import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)),
                  columns=list('ABCD'))
dfs = []
for index in range(len(df)):
    subtracted = df - df.loc[index]
    dfs.append(subtracted)
Is there a way to do this without the loop, perhaps with apply? Doing it as above is quite slow for large DataFrames...
Answer 0 (score: 4)
IIUC:
Sample DF:
In [124]: df = pd.DataFrame(np.arange(9).reshape(3,3), columns=list('abc'))
In [125]: df
Out[125]:
   a  b  c
0  0  1  2
1  3  4  5
2  6  7  8
Getting dfs (one 2-D slice per row that is subtracted):
In [126]: (df.values - df.values[:, None])
Out[126]:
array([[[ 0,  0,  0],
        [ 3,  3,  3],
        [ 6,  6,  6]],

       [[-3, -3, -3],
        [ 0,  0,  0],
        [ 3,  3,  3]],

       [[-6, -6, -6],
        [-3, -3, -3],
        [ 0,  0,  0]]])
Getting subtracted (its value after the last loop iteration, i.e. the last slice):
In [127]: (df.values - df.values[:, None])[-1]
Out[127]:
array([[-6, -6, -6],
       [-3, -3, -3],
       [ 0,  0,  0]])
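If a list of DataFrames like the loop's dfs is needed rather than a 3-D array, each slice can be wrapped back up. This is only a minimal sketch: it assumes the default integer index (so that position i matches df.loc[i]), and the names diffs, dfs_fast and dfs_loop are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3), columns=list('abc'))

# diffs[i] holds df minus row i, computed in one broadcasted operation
diffs = df.values - df.values[:, None]

# Wrap each 2-D slice back into a DataFrame with the original labels
dfs_fast = [pd.DataFrame(d, index=df.index, columns=df.columns) for d in diffs]

# Reference result from the original loop, for comparison
dfs_loop = [df - df.loc[i] for i in range(len(df))]

# Each broadcasted slice should match the corresponding loop result
print(all(np.array_equal(a.values, b.values)
          for a, b in zip(dfs_fast, dfs_loop)))
```

Note that the intermediate array has shape (n_rows, n_rows, n_cols), so for very large frames memory use grows quadratically with the number of rows.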
Some explanation:
df.values[:, None] is a synonym for df.values[:, np.newaxis] (np.newaxis is an alias for None); it inserts a new axis of length 1:
In [132]: df.values[:, np.newaxis]
Out[132]:
array([[[0, 1, 2]],

       [[3, 4, 5]],

       [[6, 7, 8]]])
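To see why the subtraction yields one slice per row, it can help to look at the shapes involved. The sketch below uses a plain array a standing in for df.values (the names a, b and result are illustrative only): the (3, 3) array is broadcast against the (3, 1, 3) array, giving a (3, 3, 3) result.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)   # stands in for df.values
b = a[:, None]                   # same as a[:, np.newaxis]

print(a.shape)                   # (3, 3)
print(b.shape)                   # (3, 1, 3)

# Broadcasting stretches the length-1 axis of b and prepends an axis to a
result = a - b
print(result.shape)              # (3, 3, 3)

# result[i, j] is row j minus row i, so result[i] equals a - a[i]
print(np.array_equal(result[1], a - a[1]))
```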