Python - Create every possibility for pairs

Time: 2019-10-23 16:08:38

Tags: python

I want to create every possibility with pairs of characters.

Example

Input:

"ab"

Output:

[["AABB", "AABb", "AaBB", "AaBb"],
 ["AABb", "AAbb", "AaBb", "Aabb"],
 ["AaBB", "AaBb", "aaBB", "aaBb"],
 ["AaBb", "Aabb", "aaBb", "aabb"]]

Basically it looks like this:

https://www.frustfrei-lernen.de/images/biologie/mendel-4.jpg.

I tried using itertools, but with that approach I get many possibilities I don't want (for example, "aaaa"). This is what I tried:

import itertools

def generate_product(l):
    yield from itertools.product(*([l] * len(l)))


characters = str(input("Gib deine Merkmale in Kleinbuchstaben ein. -> ")) # Get Input

splitted_characters = list(characters) # Split into list with chars
characters_list = splitted_characters + [a.upper() for a in splitted_characters] # create uppercase and lowercase chars


types = []
for x in generate_product(characters_list):
    types.append(["".join(x)])

for table in types:
    print(table)

For abc, it starts with:

aabbcc

For a, it is:

[["AA", "Aa"], ["Aa", "aa"]]

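4 answers:

Answer 0 (score: 3)

You can use itertools.product just fine, but you need to define the input iterables correctly. You want to iterate over each uppercase/lowercase pair twice. For the 2x2 example, you want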

itertools.product('Aa', 'Aa', 'Bb', 'Bb')

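Since this is a genetics problem, you can think of it as a loop over the possibilities for each gene, repeated for each parent. The advantage of phrasing it that way is that if one parent has a different genotype (not heterozygous), it is easy to express. For example, something like: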

itertools.product('AA', 'Aa', 'BB', 'bb')

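Running collections.Counter on the result will help you compute statistics for the offspring genotypes.

But the question remains how to do this with itertools. Repeating the elements of an iterable N times can be done with itertools.chain.from_iterable and itertools.repeat: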
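As a rough sketch of that Counter idea: the helper name genotype_counts below is hypothetical (it is not from the answer), and sorting each pair so that 'aA' and 'Aa' are counted together is my own assumption about what "statistics" should mean here.

```python
from collections import Counter
from itertools import product


def genotype_counts(*alleles):
    """Count offspring genotypes.

    `alleles` holds two strings per gene, one per parent,
    e.g. ('Aa', 'Aa', 'Bb', 'Bb'). Hypothetical helper for illustration.
    """
    counts = Counter()
    for combo in product(*alleles):
        # Regroup the flat tuple into per-gene pairs and sort each pair so the
        # uppercase allele comes first; 'aA' and 'Aa' then share one key.
        pairs = (sorted(combo[i:i + 2]) for i in range(0, len(combo), 2))
        counts[''.join(''.join(pair) for pair in pairs)] += 1
    return counts


counts = genotype_counts('Aa', 'Aa', 'Bb', 'Bb')
for genotype, n in counts.most_common():
    print(genotype, n)
```

For two heterozygous parents this yields 16 combinations spread over 9 distinct genotypes, with AaBb the most frequent.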

itertools.chain.from_iterable(itertools.repeat(x, 2) for x in ('Aa', 'Bb'))

The resulting iterator can be passed directly into itertools.product:

from itertools import chain, product, repeat

def table_entries(genes):
    possibilities = product(*chain.from_iterable(repeat((g.upper(), g.lower()), 2) for g in genes))
    return [''.join(possibility) for possibility in possibilities]

This works for any number of genes, regardless of the case of your original input:

>>> table_entries('ab')
['AABB',
 'AABb',
 'AAbB',
 'AAbb',
 'AaBB',
 'AaBb',
 'AabB',
 'Aabb',
 'aABB',
 'aABb',
 'aAbB',
 'aAbb',
 'aaBB',
 'aaBb',
 'aabB',
 'aabb']
>>> table_entries('AbC')
['AABBCC',
 'AABBCc',
 'AABBcC',
 'AABBcc',
 'AABbCC',
 ...
 'aabBcc',
 'aabbCC',
 'aabbCc',
 'aabbcC',
 'aabbcc']

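Answer 1 (score: 2)

Edited again

Since my earlier output was not 100% accurate, I realized that itertools.product is actually not well suited to this task. So I implemented a function that does the job as required: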

from itertools import product 

def pretty_product(x,y):
    x,y,X,Y = x.lower(),y.lower(),x.upper(), y.upper()
    L = [X+Y, X+y, x+Y, x+y]
    E = [] 
    for i in product(L, L):
        f, s = i
        r = f[0]+s[0] if f[0]<s[0] else s[0]+f[0]
        r += f[1]+s[1] if f[1]<s[1] else s[1]+f[1]
        E.append(r)
    return E        

letters = 'ab' 

print([pretty_product(letters[0], letters[1])[n:n+4] for n in range(0, 16, 4)])

which prints

[['AABB', 'AABb', 'AaBB', 'AaBb'], ['AABb', 'AAbb', 'AaBb', 'Aabb'], ['AaBB', 'AaBb', 'aaBB', 'aaBb'], ['AaBb', 'Aabb', 'aaBb', 'aabb']]

exactly as requested.
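The same idea can probably be generalized to any number of genes. The following is only a sketch, assuming both parents are heterozygous in every gene; the name punnett_square is made up for illustration. It builds each parent's gametes with itertools.product, then crosses rows with columns and uses sorted() to put the uppercase allele first in each pair.

```python
from itertools import product


def punnett_square(genes):
    """Punnett square for two parents that are heterozygous in every gene.

    Hypothetical helper; `genes` is a string such as 'a', 'ab' or 'abc'.
    """
    # Each gamete carries one allele per gene, e.g. 'AB', 'Ab', 'aB', 'ab'.
    gametes = [''.join(g)
               for g in product(*((c.upper(), c.lower()) for c in genes))]
    # Cross every gamete of one parent (rows) with every gamete of the other
    # (columns). sorted() places the uppercase allele first, because
    # uppercase letters compare lower than lowercase ones.
    return [[''.join(''.join(sorted(pair)) for pair in zip(row, col))
             for col in gametes]
            for row in gametes]


for row in punnett_square('ab'):
    print(row)
```

For 'ab' this gives the 4x4 table from the question, and for 'a' it gives [['AA', 'Aa'], ['Aa', 'aa']]. For n genes the table has 2**n rows and columns.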

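Answer 2 (score: 1)

Borrowing from this answer and adjusting it for your use case, where you can input just the single characters (a, ab, abc, etc.): we make sure to maintain the order AaBbCcDd... and then split the list into a Punnett square following the standard distribution, such that

AA, Aa
Aa, aa

will always be output in the proper order and quadrants (following this order for any size input):

AABB, AABb, AaBB, AaBb
AABb, AAbb, AaBb, Aabb
AaBB, AaBb, aaBB, aaBb
AaBb, Aabb, aaBb, aabb

The function: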

from itertools import chain, groupby, product

def punnett(ins):
    # first get the proper order of AaBbCc... based on input order not alphabetical
    order = ''.join(chain.from_iterable((x.upper(), x.lower()) for x in ins))
    # now get your initial square output by sorting on the index of letters from order
    # and using a lot of the same logic as other answers (and the linked source)
    ps = [''.join(sorted(''.join(e), key=lambda word: [order.index(c) for c in word]))
        for e in product(*([''.join(e) for e in product(*e)]
                    for e in zip(
                        [list(v) for _, v in groupby(order, key = str.lower)], 
                        [list(v) for _, v in groupby(order, key = str.lower)])))]
    outp = set()
    outx = []
    # Now to get your quadrants you need to do numbers
    #    from double the length of the input
    #    to the square of the length of that double
    for x in range(len(ins)*2, (len(ins)*2)**2, len(ins)):
        # using this range you need the numbers from your x minus your double (starting 0)
        # to your x minus the length
        # and since you are iterating by the length then will end up being your last x
        # Second you need starting at x and going up for the length
        # so for input of length 2 -> x=4 -> 0, 1, 4, 5
        # and next round -> x=6 -> 2, 3, 6, 7
        temp = [i for i in range(x - len(ins)*2, x - len(ins))] + [i for i in range(x, x+len(ins))]
        # and now since we need to never use the same index twice, we check to make sure none 
        # have been seen previously
        if all(z not in outp for z in temp):
            # use the numbers as indexes and put them into your list
            outx.append([ps[i] for i in temp])
            # add each individually to your set to check next time if we have seen it
            for z in temp:
                outp.add(z)
    return outx

So the output (with newlines added to make it look like a standard matrix):

>>> punnett('ab')
[['AABB', 'AABb', 'AaBB', 'AaBb'],
 ['AABb', 'AAbb', 'AaBb', 'Aabb'],
 ['AaBB', 'AaBb', 'aaBB', 'aaBb'],
 ['AaBb', 'Aabb', 'aaBb', 'aabb']]
>>> punnett('abc')
[['AABBCC', 'AABBCc', 'AABBCc', 'AABbCc', 'AABbcc', 'AABbCC'],
 ['AABBcc', 'AABbCC', 'AABbCc', 'AABbCc', 'AABbCc', 'AABbcc'],
 ['AAbbCC', 'AAbbCc', 'AAbbCc', 'AaBBCc', 'AaBBcc', 'AaBbCC'],
 ['AAbbcc', 'AaBBCC', 'AaBBCc', 'AaBbCc', 'AaBbCc', 'AaBbcc'],
 ['AaBbCC', 'AaBbCc', 'AaBbCc', 'AabbCc', 'Aabbcc', 'AaBBCC'],
 ['AaBbcc', 'AabbCC', 'AabbCc', 'AaBBCc', 'AaBBCc', 'AaBBcc']]

There are definitely efficiencies to be gained in this code to make it shorter, and you could use any of the other posters' methods to generate the initial ps if you like, assuming the generated order is the same. You would still need to apply the sorted method to them to get the desired output.

Answer 3 (score: 0)

You have to nest itertools.product like this:

list(map(''.join, product(*(map(''.join, product((c.upper(), c), repeat=2)) for c in 'ab'))))

This returns:

['AABB',
 'AABb',
 'AAbB',
 'AAbb',
 'AaBB',
 'AaBb',
 'AabB',
 'Aabb',
 'aABB',
 'aABb',
 'aAbB',
 'aAbb',
 'aaBB',
 'aaBb',
 'aabB',
 'aabb']
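To make the nesting easier to follow, the one-liner can be unpacked into two steps. This is just an illustrative sketch; the names gene_pairs and all_genotypes are hypothetical, and like the one-liner it assumes lowercase input.

```python
from itertools import product


def gene_pairs(c):
    # Inner product: every ordered pair of alleles for a single gene,
    # e.g. 'a' -> ['AA', 'Aa', 'aA', 'aa'].
    return [''.join(p) for p in product((c.upper(), c), repeat=2)]


def all_genotypes(genes):
    # Outer product: pick one pair per gene and concatenate the pairs.
    return [''.join(combo)
            for combo in product(*(gene_pairs(c) for c in genes))]


print(gene_pairs('a'))
print(all_genotypes('ab'))
```

Note that the result still distinguishes 'aA' from 'Aa', so it has 4**n entries for n genes.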