Question

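I have two different numpy arrays, A [m, n] and B [m, p]. I want to create two random samples from each of A and B, i.e. A1 [m1, n], A2 [m2, n] and B1 [m1, p], B2 [m2, p].

More specifically, suppose I take two random samples of row indices, i.e.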

rnd1 = [random.randint(1,m) for r in xrange(m1)]
rnd2 = [random.randint(1,m) for r in xrange(m2)]

and try to create the subarrays as

A1=[A[i,:] for i in rnd1]
A2=[A[i,:] for i in rnd2]

and

B1=[B[i,:] for i in rnd1]
B2=[B[i,:] for i in rnd2]

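Then A1 and B1 share the same row sequence, and likewise A2 and B2. However, rnd1 and rnd2 are not mutually exclusive.

How can I create mutually exclusive sets?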

Answer 1

One way to do this is to assign each index in [0, m - 1] to either the idx1 set or the idx2 set with equal probability, and then sample from idx1 and idx2:

import numpy as np

m, m1, m2 = 10, 5, 7
# tag each index True or False with equal probability
# note: p = m1 / (m1 + m2) can make more sense depending
# on the desired distribution
tag = np.random.binomial(n=1, p=.5, size=m) == 1

# assign True indices to idx1 and False indices to idx2
idx = np.arange(m)
idx1, idx2 = idx[tag], idx[np.logical_not(tag)]

# sample from idx1 and idx2 (np.random.choice draws with replacement by default)
i1, i2 = np.random.choice(idx1, size=m1), np.random.choice(idx2, size=m2)

The only caveat is that you need to redo the tagging until sum(tag) != 0 and sum(tag) != m, so that idx1 and idx2 are both non-empty.

You can also set p = m1 / (m1 + m2) above, depending on the distribution you are looking for.
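
The tagging approach above, including the retry step, could be sketched roughly as follows. This is only an illustration under some assumptions: the helper name split_and_sample, its p and rng parameters, and the example shapes of A and B are hypothetical and not part of the original answer, and it uses NumPy's Generator API rather than the np.random functions used above.

```python
import numpy as np

def split_and_sample(m, m1, m2, p=0.5, rng=None):
    """Hypothetical helper: draw two index samples from disjoint subsets of range(m)."""
    if m < 2 or not 0.0 < p < 1.0:
        raise ValueError("need m >= 2 and 0 < p < 1 so both sets can be non-empty")
    rng = np.random.default_rng() if rng is None else rng

    # Redo the tagging until neither index set is empty.
    while True:
        tag = rng.binomial(n=1, p=p, size=m) == 1
        if 0 < tag.sum() < m:
            break

    idx = np.arange(m)
    idx1, idx2 = idx[tag], idx[~tag]

    # Sampling is with replacement, so a row may repeat within one sample,
    # but no row index can appear in both samples.
    i1 = rng.choice(idx1, size=m1)
    i2 = rng.choice(idx2, size=m2)
    return i1, i2

# Example data (shapes assumed for illustration only).
rng = np.random.default_rng(0)
m, n, p_cols = 10, 3, 2
A = rng.normal(size=(m, n))
B = rng.normal(size=(m, p_cols))

i1, i2 = split_and_sample(m, m1=5, m2=7, rng=rng)

# Indexing both arrays with the same index array keeps their rows aligned.
A1, B1 = A[i1], B[i1]
A2, B2 = A[i2], B[i2]
```

Because each sample is drawn with replacement from its own index set, m1 and m2 may exceed the size of that set. If repeated rows are not wanted, passing replace=False to rng.choice is an option, but only when each index set has at least as many elements as the requested sample.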

Creating two mutually exclusive sets of random samples in Python
