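Shuffle a multidimensional array by columns and update a list of indexes accordingly

Date: 2017-09-13 18:14:03

Tags: python arrays performance numpy random

Given an array of N rows by M columns, I need to shuffle it by columns, while updating a separate list of (unique) column indexes so that they point to the new positions of the shuffled elements.

For example, take the following (3, 5) array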

a = [[ 0.15337424  0.21176979  0.19846229  0.5245618   0.24452392]
     [ 0.17460481  0.45727362  0.26914808  0.81620202  0.8898504 ]
     [ 0.50104826  0.22457154  0.24044079  0.09524352  0.95904348]]

and the list of column indexes:

idxs = [0 3 4]

If I shuffle the array by columns so that it looks like this:

a = [[ 0.24452392  0.19846229  0.5245618   0.21176979  0.15337424]
     [ 0.8898504   0.26914808  0.81620202  0.45727362  0.17460481]
     [ 0.95904348  0.24044079  0.09524352  0.22457154  0.50104826]]

then the index array should be modified to look like this:

idxs = [4 2 0]

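I can shuffle the array by columns by transposing it before and after the shuffle (see the code below), but I'm not sure how to update the list of indexes. The whole process needs to be as fast as possible, since it will be executed millions of times with new arrays.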

import numpy as np

def getData():
    # Array of (N, M) dimensions
    N, M = 10, 500
    a = np.random.random((N, M))

    # List of unique column indexes in a.
    # This list could be empty, or it could have a length of 'M'
    # (ie: contain all the indexes in the range of 'a').
    P = int(M * np.random.uniform())
    idxs = np.arange(0, M)
    np.random.shuffle(idxs)
    idxs = idxs[:P]

    return a, idxs


a, idxs = getData()

# Shuffle a by columns
b = a.T
np.random.shuffle(b)
a = b.T

# Update the 'idxs' list?

3 Answers:

Answer 0: (score: 1)

Use np.random.permutation -

Get a random permutation of the column indices -
col_idx = np.random.permutation(a.shape[1])

Get the shuffled input array -

shuffled_a = a[:,col_idx]

Then, simply index into the argsort of col_idx (which is its inverse permutation) with idxs to trace each index to its new position -

shuffled_idxs = col_idx.argsort()[idxs]
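
To see why the argsort step works, here is a minimal sketch using the fixed permutation from the question's example. It assumes col_idx is a true permutation of 0..M-1, in which case its argsort is the inverse permutation; the scatter-assignment variant (inv_scatter, a name introduced here for illustration) builds the same inverse without sorting, which may matter when the step is repeated millions of times -

```python
import numpy as np

# Permutation from the question's example: new column j holds old column col_idx[j].
col_idx = np.array([4, 2, 3, 1, 0])

# Inverse permutation via argsort: inv[old] is the new position of old column `old`.
inv_argsort = col_idx.argsort()

# The same inverse built by scatter assignment, which avoids the sort.
inv_scatter = np.empty_like(col_idx)
inv_scatter[col_idx] = np.arange(col_idx.size)

# Look up the new positions of the tracked columns.
idxs = np.array([0, 3, 4])
new_idxs = inv_scatter[idxs]
print(inv_argsort, inv_scatter, new_idxs)
```

Both inverses agree for any valid permutation; which one is faster in practice is worth timing on the real array sizes.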

Sample run -

In [236]: a # input array
Out[236]: 
array([[ 0.1534,  0.2118,  0.1985,  0.5246,  0.2445],
       [ 0.1746,  0.4573,  0.2691,  0.8162,  0.8899],
       [ 0.501 ,  0.2246,  0.2404,  0.0952,  0.959 ]])

In [237]: col_idx = np.random.permutation(a.shape[1])

# Let's use the sample permuted column indices to verify desired o/p
In [238]: col_idx = np.array([4,2,3,1,0])

In [239]: shuffled_a = a[:,col_idx]

In [240]: shuffled_a
Out[240]: 
array([[ 0.2445,  0.1985,  0.5246,  0.2118,  0.1534],
       [ 0.8899,  0.2691,  0.8162,  0.4573,  0.1746],
       [ 0.959 ,  0.2404,  0.0952,  0.2246,  0.501 ]])

In [241]: col_idx.argsort()[idxs]
Out[241]: array([4, 2, 0])
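
The steps above could be wrapped into one helper, sketched below. The function name shuffle_columns, the optional rng argument, and the use of np.random.default_rng (NumPy 1.17 or later) are assumptions made for this illustration, not part of the original answer -

```python
import numpy as np

def shuffle_columns(a, idxs, rng=None):
    """Return a column-shuffled copy of `a` and the updated column indexes."""
    rng = np.random.default_rng() if rng is None else rng
    # Random permutation of the column indices.
    col_idx = rng.permutation(a.shape[1])
    # New column j holds old column col_idx[j].
    shuffled_a = a[:, col_idx]
    # argsort gives the inverse permutation: old position -> new position.
    # asarray with an int dtype keeps an empty `idxs` list valid as an index.
    shuffled_idxs = col_idx.argsort()[np.asarray(idxs, dtype=int)]
    return shuffled_a, shuffled_idxs

# Data shaped like the question's getData(): N = 10 rows, M = 500 columns.
rng = np.random.default_rng(0)
a = rng.random((10, 500))
idxs = rng.permutation(500)[:120]

shuffled_a, shuffled_idxs = shuffle_columns(a, idxs, rng)

# The tracked columns hold the same data before and after the shuffle.
print(np.array_equal(shuffled_a[:, shuffled_idxs], a[:, idxs]))
```

Unlike np.random.shuffle on the transpose, this returns a new array rather than shuffling in place.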

Answer 1: (score: 0)

interact(target).draggable({onmove: dragMoveListener})
function dragMoveListener (event) {
  var target = event.target,
  // keep the dragged position in the data-x/data-y attributes
  x = (parseFloat(target.getAttribute('data-x')) || 0) + event.dx,
  y = (parseFloat(target.getAttribute('data-y')) || 0) + event.dy;

  // translate the element
  target.style.webkitTransform = target.style.transform
                               = 'translate(' + x + 'px, ' + y + 'px)';
  // update the position attributes
  target.setAttribute('data-x', x);
  target.setAttribute('data-y', y);
}

Answer 2: (score: 0)

The data array must be shuffled using the index array, so first shuffle the index array and then use it to shuffle the data array.

import numpy as np

def getData():
    # Array of (N, M) dimensions
    a = np.arange(15).reshape(3, 5)
    # [[ 0  1  2  3  4]
    # [ 5  6  7  8  9]
    # [10 11 12 13 14]]
    idxs = np.arange(a.shape[0]) #  [0 1 2]
    return a, idxs

a, idxs = getData()

# Transpose a, then shuffle the columns of the transpose (the rows of a)
b = a.T
# [[ 0  5 10]
# [ 1  6 11]
# [ 2  7 12]
# [ 3  8 13]
# [ 4  9 14]]

np.random.shuffle(idxs)  #  [2 0 1]
a = b[:, idxs]

# [[10  0  5]
# [11  1  6]
# [12  2  7]
# [13  3  8]
# [14  4  9]]

So if you want any other array, say x, to match the shuffle of array a, idxs will come in handy.
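
As a rough sketch of that idea, assuming the intent is to reuse one permutation for several arrays: the companion array x and the names perm, a_shuffled and x_shuffled are hypothetical, and here the permutation runs over the columns of a, as in the question.

```python
import numpy as np

rng = np.random.default_rng(0)

a = np.arange(15).reshape(3, 5)
# Hypothetical companion array with one entry per column of a.
x = np.arange(100, 105)

# One permutation, applied to both arrays, keeps them aligned.
perm = rng.permutation(a.shape[1])
a_shuffled = a[:, perm]
x_shuffled = x[perm]

print(a_shuffled)
print(x_shuffled)
```

Because both arrays are indexed with the same perm, entry j of x_shuffled still describes column j of a_shuffled.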