How do I generate a list of combinations in Python 3?

Time: 2013-03-08 05:48:13

Tags: python function

I need to create a function that generates a list from a text:

text = '^to[by, from] all ^appearances[appearance]'

list = ['to all appearances', 'to all appearance', 'by all appearances', 
        'by all appearance', 'from all appearances', 'from all appearance']

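That is, each value inside the brackets should replace the preceding word, the one that follows ^. I want the function to take five parameters, as shown below...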

My code (it does not work):

import re

def addSubstitution(buf, substitutions, val1='[', val2=']', dsym=',', start_p="^"):
    for i in range(1, len(buf), 2):
        buff = []
        buff.extend(buf)
        if re.search('''[^{2}]+[{0}][^{1}{0}]+?[{1}]'''.format(val1, val2, start_p), buff[i]):
            substrs = re.split('['+val1+']'+'|'+'['+val2+']'+'|'+dsym, buff[i])
            for substr in substrs:
                if substr:
                    buff[i] = substr
                    addSubstitution(buff, substitutions, val1, val2, dsym, start_p)
        return
    substitutions.add(''.join(buf))
    pass

def getSubstitution(text, val1='[', val2=']', dsym=',', start_p="^"):
    pattern = '''[^{2}]+[{0}][^{1}{0}]+?[{1}]'''.format(val1, val2, start_p)
    texts = re.split(pattern,text)
    opttexts = re.findall(pattern,text)
    buff = []
    p = iter(texts)
    t = iter(opttexts)
    buf = []
    while True:
        try:
            buf.append(next(p))
            buf.append(next(t))
        except StopIteration:
            break
    substitutions = set()
    addSubstitution(buf, substitutions, val1, val2, dsym, start_p)
    substitutions = list(substitutions)
    substitutions.sort(key=len)
    return substitutions

1 answer:

Answer 0 (score: 1)

One approach might be the following (I am skipping most of the string manipulation code):

text = '^to[by, from] all ^appearances[appearance]'

Step 1: Tokenize text like this:

tokenizedText = ['^to[by, from]', 'all', '^appearances[appearance]']
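The answer leaves the tokenizing itself open. A plain `text.split()` would break `'^to[by, from]'` at the space after the comma, so one possible sketch uses a regular expression instead. The function name `tokenize` is hypothetical, and the pattern assumes the default `^`, `[` and `]` markers with no nested brackets:

```python
import re


def tokenize(text):
    """Split text into tokens, keeping each '^word[alt, alt]' group in one piece."""
    # First alternative: a marked word with its bracketed alternatives.
    # Second alternative: any other run of non-whitespace characters.
    return re.findall(r'\^\w+\[[^\]]*\]|\S+', text)


tokenizedText = tokenize('^to[by, from] all ^appearances[appearance]')
print(tokenizedText)
```

Because the marked-word alternative is tried first, spaces inside the brackets do not split the token.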

Step 2: Prepare a list of all the words for which we need the Cartesian product (the words that start with ^).

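combinationList = []
for word in tokenizedText:
    if word[0] == '^':
        # Split '^to[by, from]' into ['to', 'by', 'from'] and add it to `combinationList`.
        head, _, rest = word[1:].partition('[')
        combinationList.append([head] + [w.strip() for w in rest.rstrip(']').split(',')])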

combinationList = [['to', 'by', 'from'], ['appearances', 'appearance']]
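As a small illustration of what the Cartesian product of such a list contains, the sketch below feeds the same two lists to `itertools.product`. The variable names `combination_list` and `choices` are only for this example:

```python
import itertools

combination_list = [['to', 'by', 'from'], ['appearances', 'appearance']]

# Each tuple holds one choice per marked word; the rightmost list varies fastest.
choices = list(itertools.product(*combination_list))

for choice in choices:
    print(choice)
```

With three alternatives for the first word and two for the second, this gives 3 * 2 = 6 tuples, one for each sentence to build.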

Step 3: Compute the Cartesian product using itertools.product(...)
import itertools

for substitution in itertools.product(*combinationList):
    counter = 0
    sentence = []
    for word in tokenizedText:
        if word[0] == '^':
            sentence.append(substitution[counter])
            counter += 1
        else:
            sentence.append(word)
    print(' '.join(sentence))    # Or append this to a list if you want to return all substitutions.
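The three steps can also be folded into one function with the five parameters from the question. This is only a sketch, not the answerer's code: the name `get_substitutions` is hypothetical, and it assumes that `val1`, `val2` and `start_p` are single characters and that brackets are not nested.

```python
import itertools
import re


def get_substitutions(text, val1='[', val2=']', dsym=',', start_p='^'):
    """Return every sentence obtained by picking one alternative per marked word."""
    # Assumption: val1, val2 and start_p are single characters (they are used
    # inside regex character classes after escaping).
    pattern = re.compile(
        r'{s}([^{o}{c}\s]+){o}([^{o}{c}]*){c}'.format(
            o=re.escape(val1), c=re.escape(val2), s=re.escape(start_p)))

    fixed_parts = []  # literal text between the marked words
    choices = []      # one list of alternatives per marked word
    pos = 0
    for match in pattern.finditer(text):
        fixed_parts.append(text[pos:match.start()])
        alternatives = [match.group(1)]
        alternatives += [alt.strip() for alt in match.group(2).split(dsym) if alt.strip()]
        choices.append(alternatives)
        pos = match.end()
    fixed_parts.append(text[pos:])

    results = []
    for combo in itertools.product(*choices):
        # Interleave the literal text with the chosen words.
        pieces = [fixed_parts[0]]
        for word, tail in zip(combo, fixed_parts[1:]):
            pieces.append(word)
            pieces.append(tail)
        results.append(''.join(pieces))
    return results


for sentence in get_substitutions('^to[by, from] all ^appearances[appearance]'):
    print(sentence)
```

Keeping the literal text between matches, rather than splitting on spaces, preserves the original spacing and punctuation of the input. A text without any marked words comes back unchanged as a one-element list.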