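I have an array of shape (m, n). Values in the first column are shared by subsets of rows, and I want to randomly shuffle the rows of the whole array while keeping rows that share the same first-column value together.
If I use numpy.random.shuffle(), all rows are shuffled indiscriminately. I want every row with the same first-column value to stay together, in its original order, in the array. Every ad hoc approach I can come up with looks cumbersome, but this is basically my goal:
Example
Input: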
array([[ 120325, 0.053, 4.23],
[ 120325, 32.232, 5.2],
[ 321, 243.4, 454],
[ 321, 4533.4, 232],
[ 321, 23.5, 108],
[ 27, 0, 454],
[ 27, 10, 32.0]])
Output (the groups should be shuffled randomly as whole blocks):
array([[ 321, 243.4, 454],
[ 321, 4533.4, 232],
[ 321, 23.5, 108],
[ 27, 0, 454],
[ 27, 10, 32.0],
[ 120325, 0.053, 4.23],
[ 120325, 32.232, 5.2]])
Answer 0 (score: 0)
You can use itertools.groupby() to accomplish this:
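import itertools
import operator
import random

x = [[ 120325, 0.053, 4.23],
[ 120325, 32.232, 5.2],
[ 321, 243.4, 454],
[ 321, 4533.4, 232],
[ 321, 23.5, 108],
[ 27, 0, 454],
[ 27, 10, 32.0]]
# groupby() only merges consecutive rows with the same key, which matches the input here.
together = [list(group) for _, group in itertools.groupby(x, key=operator.itemgetter(0))]
# [[[120325, 0.053, 4.23], [120325, 32.232, 5.2]], [[321, 243.4, 454], [321, 4533.4, 232], [321, 23.5, 108]], [[27, 0, 454], [27, 10, 32.0]]]
random.shuffle(together)  # shuffle the groups in place, not the individual rows
final = [row for group in together for row in group]  # flatten back into a list of rows
print(final)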
Output (over several runs):
[[321, 243.4, 454], [321, 4533.4, 232], [321, 23.5, 108], [27, 0, 454], [27, 10, 32.0], [120325, 0.053, 4.23], [120325, 32.232, 5.2]]
[[120325, 0.053, 4.23], [120325, 32.232, 5.2], [321, 243.4, 454], [321, 4533.4, 232], [321, 23.5, 108], [27, 0, 454], [27, 10, 32.0]]
[[27, 0, 454], [27, 10, 32.0], [120325, 0.053, 4.23], [120325, 32.232, 5.2], [321, 243.4, 454], [321, 4533.4, 232], [321, 23.5, 108]]
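Since the question starts from a NumPy array rather than a list of lists, the same idea can also be sketched directly in NumPy. This is only a minimal sketch, not part of the answer above: the function name shuffle_groups and its seed parameter are made up for illustration, and it assumes, as in the example input, that rows sharing a first-column value are already adjacent.

```python
import numpy as np

def shuffle_groups(arr, seed=None):
    """Shuffle blocks of consecutive rows that share a first-column value.

    Hypothetical helper for illustration; assumes equal keys are adjacent.
    """
    rng = np.random.default_rng(seed)
    keys = arr[:, 0]
    # Positions where the first-column value changes start a new block.
    boundaries = np.flatnonzero(keys[1:] != keys[:-1]) + 1
    blocks = np.split(arr, boundaries)
    # Permute the blocks; row order inside each block is left untouched.
    order = rng.permutation(len(blocks))
    return np.concatenate([blocks[i] for i in order])

a = np.array([[120325, 0.053, 4.23],
              [120325, 32.232, 5.2],
              [321, 243.4, 454],
              [321, 4533.4, 232],
              [321, 23.5, 108],
              [27, 0, 454],
              [27, 10, 32.0]])
print(shuffle_groups(a))
```

If equal keys can be scattered through the array, one option would be to sort the rows by the first column (for example with a stable argsort) before calling the function, at the cost of changing which rows end up next to each other.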