Question

I have a DataFrame like this:

'a'                   'b'    'c'    'd'               'e'  'f'
'hello.text'           1      2      'hello2.text'     2   10
'hello3.text'          5      8      'hello4.text'     8   15

Now I need to shuffle the columns 'a', 'b', 'c' together, i.e. randomize them as a group. Something like this:

'a'                   'b'    'c'    'd'               'e'  'f'
'hello3.text'          5      8      'hello2.text'     2   10
'hello.text'           1      2      'hello4.text'     8   15

How can I do this?

Answer 1

Use np.random.permutation with DataFrame.apply to process each column separately, because the dtypes differ:

import numpy as np

cols = ['a','b','c']

# permute each column independently; rows of a, b, c are not kept together
df[cols] = df[cols].apply(lambda x: np.random.permutation(x))
print(df)
               a  b  c              d  e   f
0   'hello.text'  5  2  'hello2.text'  2  10
1  'hello3.text'  1  8  'hello4.text'  8  15
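
For reference, here is a minimal self-contained sketch of the same per-column idea, written as an explicit loop instead of apply. The frame is rebuilt from the question's sample data, and the seeded generator (np.random.default_rng(0)) is an assumption added here for reproducibility; it is not part of the answer above. Note that each column is permuted on its own, so the values of 'a', 'b' and 'c' from one row may end up in different rows.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "a": ["hello.text", "hello3.text"],
    "b": [1, 5],
    "c": [2, 8],
    "d": ["hello2.text", "hello4.text"],
    "e": [2, 8],
    "f": [10, 15],
})
original = df.copy()

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
cols = ["a", "b", "c"]

# Permute every selected column independently of the others
for col in cols:
    df[col] = rng.permutation(df[col].to_numpy())

print(df)
```

Columns 'd', 'e' and 'f' are left untouched, and each shuffled column still contains the same values as before, only in a possibly different order.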

Answer 2

By randomizing columns 'a', 'b', 'c' together, do you mean shuffling only the rows of those specific columns? If so, the following should do what you need:

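cols = ['a','b','c']
# sample(frac=1.0) returns all rows in random order; reset_index(drop=True)
# discards the shuffled index so the assignment does not realign the rows
df[cols] = df[cols].sample(frac=1.0, random_state=0).reset_index(drop=True)
print(df)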

             a  b  c            d  e   f
0  hello3.text  5  8  hello2.text  2  10
1   hello.text  1  2  hello4.text  8  15

You can control the randomization with the random_state parameter: passing the same value reproduces the same shuffle.
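
A caveat worth sketching: DataFrame assignment aligns rows on the index, so the reset_index(drop=True) step only lines up correctly when df has the default 0..n-1 index. The variant below is an assumption-labeled sketch, not the answerer's code: it copies the target frame's index onto the shuffled block so that it also works for other indexes. The index labels "x" and "y" are hypothetical and used only for illustration.

```python
import pandas as pd

# Sample data from the question, with a hypothetical non-default index
df = pd.DataFrame(
    {
        "a": ["hello.text", "hello3.text"],
        "b": [1, 5],
        "c": [2, 8],
        "d": ["hello2.text", "hello4.text"],
        "e": [2, 8],
        "f": [10, 15],
    },
    index=["x", "y"],
)
original = df.copy()
cols = ["a", "b", "c"]

# Shuffle whole rows of the selected columns, keeping a, b, c together
shuffled = df[cols].sample(frac=1.0, random_state=0)

# Overwrite the shuffled index with the frame's own index; otherwise the
# assignment would align on index labels and undo the shuffle
shuffled.index = df.index
df[cols] = shuffled

print(df)
```

Because the block is assigned as a DataFrame rather than a NumPy array, the integer columns keep their dtype.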

Shuffling multiple columns in a DataFrame
