Select from a list of tuple pairs so that each tuple element appears at least twice

Time: 2018-11-09 05:42:38

Tags: python-3.x tuples unique combinations

Suppose I have a list of tuples containing every possible pairing of the names in a list:

matchup=[('Mike','John'),('Mike','Mary'),('Mike','Jane'),('John','Mary'),('John','Jane'),('Mary','Jane')...]

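I want to reduce the list so that each person's name appears twice, regardless of whether it is the first or the second element of a pair. A tuple element may be selected more than twice if it is not possible to create a new pair without doing so.

Thanks.

Edit: Originally, starting from the list of names, I used a for loop to pair each person with another person at random: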

import random

names=["John","Mike","Mary","Jane"]  # renamed from 'list' so the built-in is not shadowed
pairing=[]
for person in names:
    for i in range(2):
        # random.sample returns a list, so take its single element
        person2=random.sample(names,1)[0]
        this_match=str(person)+str(person2)
        while this_match in pairing:
            person2=random.sample(names,1)[0]
            this_match=str(person)+str(person2)
        pairing.append(this_match)

This resulted in the same person being repeated. My second attempt was:

from itertools import combinations
import pandas as pd
from collections import Counter

names = ["John", "Mike", "Mary", "Jane"]
possible_games = combinations(names, 2)

games = list(possible_games)
dupe_check = Counter(games)
print(dupe_check)
print(games, len(games))

However, I could not reduce the list so that each tuple element appears as close to twice as possible.

One possible output might look like:

[('Mike','John'),('Mike','Mary'),('John','Mary'),("Mary","Jane"),("Jane","Mike")]

John appears twice. Jane appears twice. Mike appears three times so that Jane can appear twice, and Mary also appears three times.
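The counts described above can be checked mechanically. The snippet below is only a minimal sketch: the helper name appearance_counts is hypothetical (it is not part of the original code), and it simply counts how often each name occurs in either position of a pair.

```python
from collections import Counter
from itertools import chain


def appearance_counts(pairs):
    """Count how often each name occurs across all pairs, in either position.

    Hypothetical helper, introduced here only for illustration.
    """
    return Counter(chain.from_iterable(pairs))


# The example output from the question.
example = [('Mike', 'John'), ('Mike', 'Mary'), ('John', 'Mary'),
           ('Mary', 'Jane'), ('Jane', 'Mike')]

counts = appearance_counts(example)
print(counts)
```

Any candidate solution can be passed through the same helper to confirm that no name falls below two appearances.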

2 answers:

Answer 0: (score: 1)

The following code should solve your problem. The variable result holds the answer.

import itertools
import random
import numpy as np

# lst is a list of names that I have chosen.
lst = ['Apple', 'Boy', 'Cat', 'Dog', 'Eagle']

# create a list of tuples (pairs of names).
matchup = list(itertools.product(lst, lst)) 

# randomly shuffle the pairs of names.
random.shuffle(matchup)


def func(inp):
    out = []
    out += [ inp[0] ]

    # Unique array of names (zip() returns an iterator in Python 3, so convert it to a list before indexing).
    unq = np.unique( list(zip(*inp))[0] )

    # Stores counts of how many times a given name features in the final list of tuples.
    counter = np.zeros(len(unq))

    indx0 = np.where( out[0][0]==unq )[0][0]
    indx1 = np.where( out[0][1]==unq )[0][0]    
    counter[indx0]+=1
    counter[indx1]+=1    

    reserve = []

    # First pass: fill the output list with tuples so that no name enters the output list more than twice.
    for i in range(1,len(inp)):
        tup = inp[i]

        indx0 , indx1 = np.where(tup[0]==unq)[0][0], np.where(tup[1]==unq)[0][0]

        temp = counter.copy()

        temp[indx0]+=1
        temp[indx1]+=1

        if ( (temp[indx0]<=2) and (temp[indx1]<=2) ):
            out += [tup]
            counter[indx0]+=1
            counter[indx1]+=1

        else: reserve += [tup]     

    # A tuple element may be selected more than twice if it is not possible to create a new pair without doing so.
    # Stop once no name has a count of 1, or once the reserve list is exhausted.
    while np.any(counter==1) and reserve:
        tup = reserve[0]

        indx0 , indx1 = np.where(tup[0]==unq)[0][0], np.where(tup[1]==unq)[0][0]

        # Create a copy of counter array.
        temp = counter.copy()

        if ( (temp[indx0]<2) or (temp[indx1]<2) ):
            out += [tup]
            counter[indx0]+=1
            counter[indx1]+=1 

        reserve.pop(0)    

    return out  

result = func(matchup)
print (result)

The output of result differs from run to run, because the list of (name) tuples is shuffled randomly on each run. One example result is shown below.

[('Cat', 'Dog'), ('Eagle', 'Boy'), ('Eagle', 'Dog'), ('Cat', 'Boy'), ('Apple', 'Apple')]      
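Note that the sample contains ('Apple', 'Apple'): itertools.product pairs every name with itself and also yields both orderings of each pair. If self-pairs are not wanted, the same two-pass idea can be sketched over itertools.combinations instead. This is only a rough sketch under that assumption, not the code above; the function name pick_pairs and its target and seed parameters are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def pick_pairs(names, target=2, seed=None):
    """Greedy two-pass selection (hypothetical helper, for illustration only)."""
    rng = random.Random(seed)
    candidates = list(combinations(names, 2))  # no self-pairs, each pair once
    rng.shuffle(candidates)

    counts = Counter({name: 0 for name in names})
    chosen, reserve = [], []

    # Pass 1: accept a pair only if neither name would exceed the target.
    for a, b in candidates:
        if counts[a] < target and counts[b] < target:
            chosen.append((a, b))
            counts[a] += 1
            counts[b] += 1
        else:
            reserve.append((a, b))

    # Pass 2: top up names still below the target, letting the partner exceed it.
    for a, b in reserve:
        if counts[a] < target or counts[b] < target:
            chosen.append((a, b))
            counts[a] += 1
            counts[b] += 1

    return chosen, counts


chosen, counts = pick_pairs(['Apple', 'Boy', 'Cat', 'Dog', 'Eagle'], seed=1)
print(chosen)
print(counts)
```

With at least three names, every name has at least two candidate pairs, so the second pass can always bring it up to the target of two.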

Answer 1: (score: 1)

I think the simplest way to get each name exactly twice is:

lst = ["John", "Mike", "Mary", "Jane"]  # not shadowing 'list'

pairs = list(zip(lst, lst[1:]+lst[:1]))
pairs
# [('John', 'Mike'), ('Mike', 'Mary'), ('Mary', 'Jane'), ('Jane', 'John')]

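This essentially walks around the list in a circle and pairs each element with its two neighbors. If you need more randomness, you can shuffle the list beforehand, or split the list into chunks and apply the same approach to each chunk.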
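The shuffle-first variant could look roughly like this. It is a minimal sketch: the function name ring_pairs and its seed parameter are hypothetical, and it assumes at least three distinct names (with only two, the ring would produce the same pair in both orders).

```python
import random


def ring_pairs(names, seed=None):
    """Shuffle a copy of names, then pair each element with its successor, wrapping around.

    Hypothetical helper, for illustration only.
    """
    order = list(names)                 # copy, so the caller's list is untouched
    random.Random(seed).shuffle(order)
    return list(zip(order, order[1:] + order[:1]))


pairs = ring_pairs(["John", "Mike", "Mary", "Jane"], seed=0)
print(pairs)
```

Because every element has exactly one predecessor and one successor in the ring, each name still appears exactly twice, whatever the shuffle order.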