Shuffling a DataFrame

Time: 2019-04-05 15:10:53

Tags: python pandas

I have the following pandas DataFrame:

import pandas as pd
timestamps = [pd.Timestamp(2015,1,1), pd.Timestamp(2015,1,3), pd.Timestamp(2015,4,1), pd.Timestamp(2015,11,1)]
quantities = [1, 16, 9, 4]
e_quantities = [1, 4, 3, 2]
data = dict(quantities=quantities, e_quantities=e_quantities)
df = pd.DataFrame(data=data, columns=data.keys(), index=timestamps)

which looks like this:

            quantities  e_quantities
2015-01-01           1             1
2015-01-03          16             4
2015-04-01           9             3
2015-11-01           4             2

I want to shuffle all the columns relative to the index, while keeping the values within each row together. I have done it like this:

import numpy as np
indices_scrambled = np.arange(0, len(timestamps))
np.random.shuffle(indices_scrambled)
df.quantities = df.quantities.values[indices_scrambled]
df.e_quantities = df.e_quantities.values[indices_scrambled]

This works and produces, for example:

            quantities  e_quantities
2015-01-01          16             4
2015-01-03           9             3
2015-04-01           1             1
2015-11-01           4             2

However, this does not scale well if I add many columns, because I have to keep writing df.column_1 = df.column_1.values[indices_scrambled], df.column_2 = df.column_2.values[indices_scrambled], and so on.

Is there a way to shuffle all columns of the DataFrame at once, leaving only the index in place?

Thanks for any help!

2 answers:

Answer 0: (score: 1)

This should work for you: shuffle the rows, then put the original index back.

from sklearn.utils import shuffle
index = df.index
df = shuffle(df)
df.index = index
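
If scikit-learn is not available, the same idea can probably be expressed with pandas alone. The following is a minimal sketch, assuming that DataFrame.sample(frac=1) is an acceptable way to shuffle whole rows; the random_state value is an arbitrary choice made only for reproducibility and is not part of the original answer.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
timestamps = [pd.Timestamp(2015, 1, 1), pd.Timestamp(2015, 1, 3),
              pd.Timestamp(2015, 4, 1), pd.Timestamp(2015, 11, 1)]
df = pd.DataFrame(
    {"quantities": [1, 16, 9, 4], "e_quantities": [1, 4, 3, 2]},
    index=timestamps,
)

# Shuffle whole rows (the index moves along with them) ...
shuffled = df.sample(frac=1, random_state=0)  # random_state: assumption, for reproducibility
# ... then restore the original, unshuffled index.
shuffled.index = df.index

print(shuffled)
```

Because whole rows are moved, each quantities value stays next to its e_quantities value, regardless of how many columns the DataFrame has.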

Answer 1: (score: 0)

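Try the following. It uses the same np.random.shuffle() as in the question, but draws the shuffled order once and then applies it to every column in a loop, so the rows stay matched:

# Draw one shuffled order of row positions.
indices_scrambled = np.arange(len(df))
np.random.shuffle(indices_scrambled)

# Apply that same order to every column.
for col in df.columns.to_list():
    df[col] = df[col].values[indices_scrambled]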

print(df)
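
The rows only stay matched if a single permutation is drawn and reused for every column; shuffling each column independently would break the pairing. As a sketch of that idea wrapped in a function (the name shuffle_rows_keep_index and the seed parameter are hypothetical and do not come from either answer), one permutation can be applied to all columns at once through iloc:

```python
import numpy as np
import pandas as pd


def shuffle_rows_keep_index(df, seed=None):
    """Return a copy of df with its rows shuffled and the original index kept.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(df))   # one permutation shared by all columns
    out = df.iloc[perm].copy()        # reorder whole rows by position
    out.index = df.index              # put the original index back
    return out


timestamps = [pd.Timestamp(2015, 1, 1), pd.Timestamp(2015, 1, 3),
              pd.Timestamp(2015, 4, 1), pd.Timestamp(2015, 11, 1)]
df = pd.DataFrame(
    {"quantities": [1, 16, 9, 4], "e_quantities": [1, 4, 3, 2]},
    index=timestamps,
)

result = shuffle_rows_keep_index(df, seed=42)

# In this data set quantities == e_quantities ** 2, so the pairing is easy to check.
rows_still_matched = bool((result["quantities"] == result["e_quantities"] ** 2).all())
print(result)
print(rows_still_matched)
```

This version leaves the input DataFrame untouched and preserves each column's dtype, since it never passes through a single mixed NumPy array.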