Create a random order of (x, y) pairs, without repeating/consecutive x values

Time: 2016-04-06 15:05:59

Tags: python

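Say I have a list of valid X = [1, 2, 3, 4, 5] and a list of valid Y = [1, 2, 3, 4, 5].

I need to generate every combination of an element in X with an element in Y (25 in this case) and then get those combinations in random order.

That in itself would be simple, but there is an additional requirement: in this random order, the same x must not appear twice in a row. For example, this is okay: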

[1, 3]
[2, 5]
[1, 2]
...
[1, 4]

This is not:

[1, 3]
[1, 2]  <== the "1" cannot repeat, because there was already one before
[2, 5]
...
[1, 4]

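Now, the least efficient idea would be to simply keep randomizing the full set until there are no more repetitions. My approach is a bit different: I repeatedly create a shuffled variant of X, plus a list of all Y * X, and then pick a random next element from those. So far, I've come up with this: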

import random

output = []
num_x  = 5
num_y  = 5

all_ys = list(xrange(1, num_y + 1)) * num_x

while True:
    # end if no more are available
    if len(output) == num_x * num_y:
        break

    xs = list(xrange(1, num_x + 1))
    while len(xs):
        next_x = random.choice(xs)
        next_y = random.choice(all_ys)

        if [next_x, next_y] not in output:
            xs.remove(next_x)
            all_ys.remove(next_y)
            output.append([next_x, next_y])

print(sorted(output))

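But I'm sure this can be done more efficiently or more succinctly?

Also, my solution first goes through all X values before continuing with the full set again, so it is not perfectly random. I can live with that for my particular application.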
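The requirement above can be captured in a small checker, which is handy for testing any of the approaches that follow. This is only a sketch for Python 3; the name `is_valid_order` is made up here and is not part of the question.

```python
from collections import Counter
from itertools import product

def is_valid_order(order, xs, ys):
    """Return True if `order` lists every (x, y) pair exactly once
    and no two neighbouring pairs share the same x."""
    pairs = [tuple(p) for p in order]
    # every combination of xs and ys must appear exactly once
    if Counter(pairs) != Counter(product(xs, ys)):
        return False
    # neighbouring pairs must differ in their first element
    return all(a[0] != b[0] for a, b in zip(pairs, pairs[1:]))

good = [(1, 1), (2, 2), (1, 2), (2, 1)]
bad = [(1, 1), (1, 2), (2, 1), (2, 2)]
print(is_valid_order(good, [1, 2], [1, 2]))  # True
print(is_valid_order(bad, [1, 2], [1, 2]))   # False: x == 1 twice in a row
```

The checker accepts pairs given as lists or tuples, so it works with outputs shaped like `[[1, 3], [2, 5], ...]` as well.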

9 answers:

Answer 0: (score: 2)

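An interesting question! Here is my solution. It has the following properties:

  • If there is no valid solution, it should detect this and let you know
  • The iteration is guaranteed to terminate, so it should never get stuck in an infinite loop
  • Any possible solution is reachable with nonzero probability

I do not know the distribution of the output over all possible solutions, but I think it should be uniform, because there is no obvious asymmetry inherent in the algorithm. I would be surprised, and pleased, to be shown otherwise, though!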

import random

def random_without_repeats(xs, ys):
    pairs = [[x,y] for x in xs for y in ys]
    output = [[object()], [object()]]
    seen = set()
    while pairs:
        # choose a random pair from the ones left
        indices = list(set(xrange(len(pairs))) - seen)
        try:
            index = random.choice(indices)
        except IndexError:
            raise Exception('No valid solution exists!')
        # the first element of our randomly chosen pair
        x = pairs[index][0]
        # search for a valid place in output where we slot it in
        for i in xrange(len(output) - 1):
            left, right = output[i], output[i+1]
            if x != left[0] and x != right[0]:
                output.insert(i+1, pairs.pop(index))
                seen = set()
                break
        else:
            # make sure we don't randomly choose a bad pair like that again
            seen |= {i for i in indices if pairs[i][0] == x}
    # trim off the sentinels
    output = output[1:-1]
    assert len(output) == len(xs) * len(ys)
    # sanity check: no two neighbouring pairs may share the same x
    assert not any(L[0] == R[0] for L, R in zip(output[:-1], output[1:]))
    return output


nx, ny = 5, 5       # OP example
# nx, ny = 2, 10      # output must alternate in 1st index
# nx, ny = 4, 13      # shuffle 'deck of cards' with no repeating suit
# nx, ny = 1, 5       # should raise 'No valid solution exists!' exception

xs = range(1, nx+1)
ys = range(1, ny+1)

for pair in random_without_repeats(xs, ys):
    print pair
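The uniformity question raised in this answer can be probed empirically for very small inputs. The sketch below (Python 3) enumerates every valid ordering by brute force and tallies how often a sampler returns each one. The names `all_valid_orders` and `tally` are hypothetical helpers, not part of the answer, and the brute force is only feasible for a handful of pairs.

```python
from collections import Counter
from itertools import permutations, product

def all_valid_orders(xs, ys):
    """Enumerate every ordering of the (x, y) pairs in which no two
    neighbouring pairs share an x. Tries all permutations, so it is
    only usable for very small inputs."""
    pairs = list(product(xs, ys))
    return [
        order for order in permutations(pairs)
        if all(a[0] != b[0] for a, b in zip(order, order[1:]))
    ]

def tally(sampler, xs, ys, runs):
    """Count how often `sampler(xs, ys)` returns each ordering."""
    return Counter(tuple(map(tuple, sampler(xs, ys))) for _ in range(runs))

valid = all_valid_orders([1, 2], [1, 2])
print(len(valid))  # 8 valid orderings in the 2 x 2 case
```

With `random_without_repeats` ported to Python 3 (`xrange` becomes `range`, `print pair` becomes `print(pair)`), calling `tally(random_without_repeats, [1, 2], [1, 2], 80000)` and comparing each count with `80000 / len(valid)` would give a rough indication of whether the output is uniform.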

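Answer 1: (score: 2)

Here is my solution. First, a tuple is chosen among those whose x value differs from that of the previously chosen tuple. But I noticed that you need a final trick for the case where only tuples with the bad x value are left at the end.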

import random

num_x = 5
num_y = 5

all_ys = range(1,num_y+1)*num_x
all_xs = sorted(range(1,num_x+1)*num_y)

output = []

last_x = -1

for i in range(0,num_x*num_y):

    #get list of possible tuple to place    
    all_ind    = range(0,len(all_xs))
    all_ind_ok = [k for k in all_ind if all_xs[k]!=last_x]

    ind = random.choice(all_ind_ok)

    last_x = all_xs[ind]
    output.append([all_xs.pop(ind),all_ys.pop(ind)])


    if(all_xs.count(last_x)==len(all_xs)):#if only last_x tuples,
        break  

if len(all_xs)>0: # if there are still tuples they are randomly placed
    nb_to_place = len(all_xs)
    while(len(all_xs)>0):
        place = random.randint(0,len(output)-1)
        # compare the x component of the neighbours, not the whole pair
        if output[place][0]==last_x:
            continue
        if place>0:
            if output[place-1][0]==last_x:
                continue
        output.insert(place,[all_xs.pop(),all_ys.pop()])

print output

Answer 2: (score: 2)

A simple solution that ensures average O(N*M) complexity:

import random

def pseudorandom(M,N):
    l=[(x+1,y+1) for x in range(N) for y in range(M)]
    random.shuffle(l)
    for i in range(M*N-1):
            for j in range (i+1,M*N): # find a compatible ...
                if l[i][0] != l[j][0]:
                    l[i+1],l[j] = l[j],l[i+1]
                    break  
            else:   # or insert otherwise.
                while True:
                    l[i],l[i-1] = l[i-1],l[i]
                    i-=1
                    if l[i][0] != l[i-1][0]: break  
    return l

Some tests:

In [354]: print(pseudorandom(5,5))
[(2, 2), (3, 1), (5, 1), (1, 1), (3, 2), (1, 2), (3, 5), (1, 5), (5, 4),\
(1, 3), (5, 2), (3, 4), (5, 3), (4, 5), (5, 5), (1, 4), (2, 5), (4, 4), (2, 4),\ 
(4, 2), (2, 1), (4, 3), (2, 3), (4, 1), (3, 3)]

In [355]: %timeit pseudorandom(100,100)
10 loops, best of 3: 41.3 ms per loop

Answer 3: (score: 2)

Here is a solution using NumPy:

import numpy as np

def generate_pairs(xs, ys):
    n = len(xs)
    m = len(ys)
    indices = np.arange(n)

    array = np.tile(ys, (n, 1))
    [np.random.shuffle(array[i]) for i in range(n)]

    counts = np.full_like(xs, m)
    i = -1

    for _ in range(n * m):
        weights = np.array(counts, dtype=float)
        if i != -1:
            weights[i] = 0
        weights /= np.sum(weights)

        i = np.random.choice(indices, p=weights)
        counts[i] -= 1
        pair = xs[i], array[i, counts[i]]
        yield pair

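Here is a Jupyter notebook that explains how it works.

Inside the loop, we have to copy the weights, sum them, and choose a random index using the weights. These are all linear in n. So the overall complexity of generating all pairs is O(n^2 m).

But the runtime is deterministic and the overhead is low. And I'm fairly sure it generates all legal sequences with equal probability.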
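The weighting idea can also be sketched without NumPy. The version below (Python 3) is an illustration rather than a port of the answer: it assumes the x values are distinct, draws the next x with `random.choices` weighted by how many pairs each x still has to place, and excludes the previous x. A dead end is possible in this sketch (only the previous x may have pairs left), so a hypothetical retry wrapper, `weighted_pairs_with_retry`, is added; neither function name comes from the answer.

```python
import random

def weighted_pairs(xs, ys):
    """One attempt at a valid order; raises RuntimeError on a dead end."""
    # pre-shuffle the ys for each x so that pop() hands them out in random order
    remaining = {x: random.sample(list(ys), len(ys)) for x in xs}
    order, last = [], None
    for _ in range(len(xs) * len(ys)):
        # the previous x and any exhausted x are not eligible
        candidates = [x for x in xs if x != last and remaining[x]]
        if not candidates:
            raise RuntimeError("dead end: only the previous x has pairs left")
        # weight each candidate by the number of pairs it still has to place
        weights = [len(remaining[x]) for x in candidates]
        last = random.choices(candidates, weights=weights, k=1)[0]
        order.append((last, remaining[last].pop()))
    return order

def weighted_pairs_with_retry(xs, ys, max_attempts=1000):
    """Retry `weighted_pairs` until one attempt succeeds."""
    for _ in range(max_attempts):
        try:
            return weighted_pairs(xs, ys)
        except RuntimeError:
            continue
    raise RuntimeError("no valid order found; one may not exist")

print(weighted_pairs_with_retry([1, 2, 3, 4, 5], [1, 2, 3, 4, 5]))
```

Restarting after a dead end changes the distribution of the results, so this sketch makes no claim about uniformity.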

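Answer 4: (score: 1)

This should do what you want.

rando will never generate the same X twice in a row, but I realized that a repeat is still possible: because duplicate pairs may be discarded, the next accepted pair could land on the previous X. It seems unlikely, in that I never noticed it happen in the 10 or so times I ran the code without the extra check. Oh! But I think I figured it out... I will update my answer in a moment.

import itertools
import random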

X = [1,2,3,4,5]
Y = [1,2,3,4,5]


def rando(choice_one, choice_two):
    last_x = random.choice(choice_one)
    while True:
        yield last_x, random.choice(choice_two)
        possible_x = choice_one[:]
        possible_x.remove(last_x)
        last_x = random.choice(possible_x)


all_pairs = set(itertools.product(X, Y))
result = []
r = rando(X, Y)
while set(result) != all_pairs:
    pair = next(r)
    if pair not in result:
        if result and result[-1][0] == pair[0]:
            continue
        result.append(pair)

import pprint
pprint.pprint(result)

Answer 5: (score: 1)

Distribute the x values evenly across the output (5 times for each value):

import random

def random_combo_without_x_repeats(xvals, yvals):
    # produce all valid combinations, but group by `x` and shuffle the `y`s
    grouped = [[x, random.sample(yvals, len(yvals))] for x in xvals]
    last_x = object()  # sentinel not equal to anything
    while grouped[0][1]:  # still `y`s left
        for _ in range(len(xvals)):
            # shuffle the `x`s, but skip any ordering that would
            # produce consecutive `x`s.
            random.shuffle(grouped)
            if grouped[0][0] != last_x:
                break
        else:
            # we tried to reshuffle N times, but ended up with the same `x` value
            # in the first position each time. This is pretty unlikely, but
            # if this happens we bail out and just reverse the order. That is
            # more than good enough.
            grouped = grouped[::-1]
        # yield a set of (x, y) pairs for each unique x
        # Pick one y (from the pre-shuffled groups per x
        for x, ys in grouped:
            yield x, ys.pop()
        last_x = x

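This first shuffles the y values for each x value, then gives you an x, y combination for each x. The order in which the x values are produced is reshuffled on every iteration, and that is where the restriction is tested.

This is random, but you will get every number from 1 to 5 in the x position before you see the same number again: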

>>> list(random_combo_without_x_repeats(range(1, 6), range(1, 6)))
[(2, 1), (3, 2), (1, 5), (5, 1), (4, 1),
 (2, 4), (3, 1), (4, 3), (5, 5), (1, 4),
 (5, 2), (1, 1), (3, 3), (4, 4), (2, 5),
 (3, 5), (2, 3), (4, 2), (1, 2), (5, 4),
 (2, 2), (3, 4), (1, 3), (4, 5), (5, 3)]

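(I manually grouped that into sets of 5.) Overall, this makes for a pretty good random shuffle of a fixed input set under your restriction.

It is also efficient: because there is only a 1-in-N chance that you have to reshuffle the x order, you should see only about one reshuffle over a full run of the algorithm. The whole algorithm therefore stays within O(N * M) bounds, which is pretty much ideal for something that produces N times M output elements. Because we limit the reshuffling to at most N attempts before falling back to a simple reversal, we avoid the (extremely unlikely) possibility of reshuffling endlessly.

The only drawback is that it has to create N copies of the M y values up front.

Answer 6: (score: 1)

For completeness, I thought I would throw in the super-naive "just keep shuffling until you get one" solution. It is not even guaranteed to terminate, but if it does, it will have a good degree of randomness, and you did say that one of the desired qualities was succinctness, and this sure is succinct: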

import itertools
import random

x = range(5)  # this is a list in Python 2
y = range(5)
all_pairs = list(itertools.product(x, y))

s = list(all_pairs)  # make a working copy
while any(s[i][0] == s[i + 1][0] for i in range(len(s) - 1)):
    random.shuffle(s)
print s

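As was commented, for small values of x and y (especially y!), this is actually a reasonably quick solution. Your example of 5 for each completes in an average time of "right away". The deck-of-cards example (4 and 13) can take much longer, because it will usually require hundreds of thousands of shuffles. (And again, it is not guaranteed to terminate at all.)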
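Since termination is not guaranteed, one option is to cap the number of shuffles. The following is a minimal Python 3 sketch of that variation; the function name `shuffle_until_valid` and the `max_tries` parameter are assumptions introduced here, not part of the answer.

```python
import itertools
import random

def shuffle_until_valid(xs, ys, max_tries=100000):
    """Reshuffle all (x, y) pairs until no neighbours share an x.
    Returns None if `max_tries` shuffles were not enough."""
    pairs = list(itertools.product(xs, ys))
    for _ in range(max_tries):
        random.shuffle(pairs)
        if all(a[0] != b[0] for a, b in zip(pairs, pairs[1:])):
            return pairs
    return None

print(shuffle_until_valid(range(3), range(3)))
```

Returning `None` lets the caller decide whether to raise the cap or switch to one of the constructive approaches from the other answers.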

Answer 7: (score: 1)

Here is an evolutionary algorithm approach. It first evolves a list in which the elements of X are each repeated len(Y) times, and then it randomly fills in each element of Y len(X) times. The resulting orders seem fairly random; a sample run follows the code:

import random

#the following fitness function measures
#the number of times in which
#consecutive elements in a list
#are equal

def numRepeats(x):
    n = len(x)
    if n < 2: return 0
    repeats = 0
    for i in range(n-1):
        if x[i] == x[i+1]: repeats += 1
    return repeats

def mutate(xs):
    #swaps random pairs of elements
    #returns a new list
    #one of the two indices is chosen so that
    #it is in a repeated pair
    #and swapped element is different

    n = len(xs)
    repeats = [i for i in range(n) if (i > 0 and xs[i] == xs[i-1]) or (i < n-1 and xs[i] == xs[i+1])]
    i = random.choice(repeats)
    j = random.randint(0,n-1)
    while xs[j] == xs[i]: j = random.randint(0,n-1)
    ys = xs[:]
    ys[i], ys[j] = ys[j], ys[i]
    return ys

def evolveShuffle(xs, popSize = 100, numGens = 100):
    #tries to evolve a shuffle of xs so that consecutive
    #elements are different
    #takes the best 10% of each generation and mutates each 9
    #times. Stops when a perfect solution is found
    #popsize assumed to be a multiple of 10

    population = []

    for i in range(popSize):
        deck = xs[:]
        random.shuffle(deck)
        fitness = numRepeats(deck)
        if fitness == 0: return deck
        population.append((fitness,deck))

    for i in range(numGens):
        population.sort(key = (lambda p: p[0]))
        newPop = []
        for i in range(popSize//10):
            fit,deck = population[i]
            newPop.append((fit,deck))
            for j in range(9):
                newDeck = mutate(deck)
                fitness = numRepeats(newDeck)
                if fitness == 0: return newDeck
                newPop.append((fitness,newDeck))
        population = newPop
    #if you get here :
    return [] #no special shuffle found

#the following function takes a list x
#with n distinct elements (n>1) and an integer k
#and returns a random list of length nk
#where consecutive elements are not the same

def specialShuffle(x,k):
    n = len(x)
    if n == 2:
        if random.random() < 0.5:
            a,b = x
        else:
            b,a = x
        return [a,b]*k
    else:
        deck = x*k
        return evolveShuffle(deck)

def randOrder(x,y):
    xs = specialShuffle(x,len(y))
    d = {}
    for i in x:
        ys = y[:]
        random.shuffle(ys)
        d[i] = iter(ys)

    pairs = []
    for i in xs:
        pairs.append((i,next(d[i])))
    return pairs

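>>> randOrder([1,2,3,4,5],[1,2,3,4,5])
[(1, 4), (3, 1), (4, 5), (2, 2), (4, 3),
 (5, 3), (2, 1), (3, 3), (1, 1), (5, 2),
 (1, 3), (2, 5), (1, 5), (3, 5), (5, 5),
 (4, 4), (2, 3), (3, 2), (5, 4), (2, 4),
 (4, 2), (1, 2), (5, 1), (4, 1), (3, 4)]

A solution becomes harder to find as len(X) and len(Y) get larger (and the code is designed to return an empty list in that eventuality), in which case the parameters popSize and numGens can be increased. As it is, it is able to find 20x20 solutions very rapidly. It takes about a minute when X and Y are of size 100, but even then it is able to find a solution (in the times that I have run it).

Answer 8: (score: 1)

Interesting restriction! I probably overthought this, solving a more general problem: shuffling an arbitrary list of sequences such that (if possible) no two adjacent sequences share a first item.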

from itertools import product
from random import choice, randrange, shuffle

def combine(*sequences):
    return playlist(product(*sequences))

def playlist(sequence):
    r'''Shuffle a set of sequences, avoiding repeated first elements.
    '''#"""#'''
    result = list(sequence)
    length = len(result)
    if length < 2:
        # No rearrangement is possible.
        return result
    def swap(a, b):
        if a != b:
            result[a], result[b] = result[b], result[a]
    swap(0, randrange(length))
    for n in range(1, length):
        previous = result[n-1][0]
        choices = [x for x in range(n, length) if result[x][0] != previous]
        if not choices:
            # Trapped in a corner: Too many of the same item are left.
            # Backtrack as far as necessary to interleave other items.
            minor = 0
            major = length - n
            while n > 0:
                n -= 1
                if result[n][0] == previous:
                    major += 1
                else:
                    minor += 1
                if minor == major - 1:
                    if n == 0 or result[n-1][0] != previous:
                        break
            else:
                # The requirement can't be fulfilled,
                # because there are too many of a single item.
                shuffle(result)
                break

            # Interleave the majority item with the other items.
            major = [item for item in result[n:] if item[0] == previous]
            minor = [item for item in result[n:] if item[0] != previous]
            shuffle(major)
            shuffle(minor)
            result[n] = major.pop(0)
            n += 1
            while n < length:
                result[n] = minor.pop(0)
                n += 1
                result[n] = major.pop(0)
                n += 1
            break
        swap(n, choice(choices))
    return result

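This starts out simple, but when it discovers that it cannot find an item with a different first element, it works out how far back it needs to go to interleave that element with something else. The main loop therefore traverses the array at most three times (once backwards), but usually just once. Granted, each iteration of the first forward pass checks every remaining item in the array, and the array itself contains every pair, so the overall run time is O((NM)**2).

For your specific problem: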

>>> X = Y = [1, 2, 3, 4, 5]
>>> combine(X, Y)
[(3, 5), (1, 1), (4, 4), (1, 2), (3, 4),
 (2, 3), (5, 4), (1, 5), (2, 4), (5, 5),
 (4, 1), (2, 2), (1, 4), (4, 2), (5, 2),
 (2, 1), (3, 3), (2, 5), (3, 2), (1, 3),
 (4, 3), (5, 3), (4, 5), (5, 1), (3, 1)]

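By the way, this compares x values by equality, not by position in the X array, which may make a difference if the array can contain duplicates. In fact, duplicate values might trigger the fallback case of shuffling all pairs together if more than half of the X values are the same.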
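The "more than half" condition can be checked up front. A multiset can be ordered with no two equal neighbours exactly when its most common element fits into every other slot, that is, when its count is at most ceil(n / 2). The sketch below (Python 3) applies that test to the first items; `can_avoid_repeats` is a hypothetical helper name, not part of the answer.

```python
from collections import Counter

def can_avoid_repeats(first_items):
    """True if the items can be ordered so that no two neighbours are equal.
    This holds exactly when the most common item fills at most every other slot."""
    items = list(first_items)
    if not items:
        return True
    most_common_count = Counter(items).most_common(1)[0][1]
    return most_common_count <= (len(items) + 1) // 2

print(can_avoid_repeats([1, 1, 2]))     # True:  1, 2, 1
print(can_avoid_repeats([1, 1, 1, 2]))  # False: one 2 cannot separate three 1s
```

For the X-by-Y problem with distinct x values, each x appears len(Y) times among len(X) * len(Y) pairs, so the test only fails when len(X) is 1 and len(Y) is at least 2.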