Find all words whose characters match another word in Python

Time: 2011-02-05 11:49:42

Tags: python

For example, umbellar = umbrella: these two words should count as equal.

input = ["umbellar", "goa", "umbrella", "ago", "aery", "alem", "ayre", "gnu", "eyra", "egma", "game", "leam", "amel", "year", "meal", "yare", "gun", "alme", "ung", "male", "lame", "mela", "mage"]

So the output should be:

output = [
    ["umbellar", "umbrella"],
    ["ago", "goa"],
    ["aery", "ayre", "eyra", "yare", "year"],
    ["alem", "alme", "amel", "lame", "leam", "male", "meal", "mela"],
    ["gnu", "gun", "ung"],
    ["egma", "game", "mage"],
]

5 answers:

Answer 0 (score: 7)


from itertools import groupby

def group_words(word_list):
    sorted_words = sorted(word_list, key=sorted)
    grouped_words = groupby(sorted_words, sorted)
    for key, words in grouped_words:
        group = list(words)
        if len(group) > 1:
            yield group

Example:

>>> group_words(["umbellar","goa","umbrella","ago","aery","alem","ayre","gnu","eyra","egma","game","leam","amel","year","meal","yare","gun","alme","ung","male","lame","mela","mage" ])
<generator object group_words at 0x0297B5F8>
>>> list(_)
[['umbellar', 'umbrella'], ['egma', 'game', 'mage'], ['alem', 'leam', 'amel', 'meal', 'alme', 'male', 'lame', 'mela'], ['aery', 'ayre', 'eyra', 'year', 'yare'], ['goa', 'ago'], ['gnu', 'gun', 'ung']]
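
The initial `sorted(word_list, key=sorted)` call matters because `itertools.groupby` only merges adjacent items that share a key. A minimal sketch of the difference, using a made-up three-word list (the name `words` and its contents are illustrative, not taken from the question):

```python
from itertools import groupby

# Hypothetical sample list; "gnu" and "gun" are anagrams but not adjacent.
words = ["gnu", "goa", "gun"]

# Without sorting first, groupby starts a new group at every key change,
# so the two anagrams end up in separate groups.
unsorted_groups = [list(g) for _, g in groupby(words, key=sorted)]

# Sorting by the same key first places anagrams next to each other.
sorted_groups = [list(g) for _, g in groupby(sorted(words, key=sorted), key=sorted)]

print(unsorted_groups)  # [['gnu'], ['goa'], ['gun']]
print(sorted_groups)    # [['goa'], ['gnu', 'gun']]
```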

Answer 1 (score: 4)

They are not equal words, they are anagrams.

Anagrams can be found by sorting their characters:

sorted('umbellar') == sorted('umbrella')
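
As a small sketch, that check can be wrapped in a function. The names `is_anagram` and `is_anagram_counter` are made up here for illustration; the second one uses `collections.Counter` to compare letter counts instead of sorted lists, which should give the same answer:

```python
from collections import Counter

def is_anagram(a, b):
    # Two words are anagrams when their sorted letters are identical.
    return sorted(a) == sorted(b)

def is_anagram_counter(a, b):
    # The same test expressed as a comparison of letter counts.
    return Counter(a) == Counter(b)

print(is_anagram("umbellar", "umbrella"))          # True
print(is_anagram("umbellar", "goa"))               # False
print(is_anagram_counter("umbellar", "umbrella"))  # True
```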

Answer 2 (score: 1)

collections.defaultdict comes in handy:

from collections import defaultdict

input = ["umbellar","goa","umbrella","ago","aery","alem","ayre","gnu",
"eyra","egma","game","leam","amel","year","meal","yare","gun",
"alme","ung","male","lame","mela","mage" ]

D = defaultdict(list)
for i in input:
    key = ''.join(sorted(i))  # sort the letters of the current word, not the whole list
    D[key].append(i)

output = D.values()

The output is (the order of the groups may vary) [['umbellar', 'umbrella'], ['goa', 'ago'], ['gnu', 'gun', 'ung'], ['alem', 'leam', 'amel', 'meal', 'alme', 'male', 'lame', 'mela'], ['egma', 'game', 'mage'], ['aery', 'ayre', 'eyra', 'year', 'yare']]
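
Note that a dictionary keyed on sorted letters keeps every word, including words with no anagram partner. If only groups of two or more are wanted, as in the question, a filter along these lines might do. The helper name `anagram_groups` and the sample list are assumptions for illustration:

```python
from collections import defaultdict

def anagram_groups(words):
    groups = defaultdict(list)
    for word in words:
        # The sorted letters act as a signature shared by all anagrams.
        groups[''.join(sorted(word))].append(word)
    # Keep only signatures that matched more than one word.
    return [group for group in groups.values() if len(group) > 1]

print(anagram_groups(["goa", "ago", "python", "gnu", "gun", "ung"]))
# [['goa', 'ago'], ['gnu', 'gun', 'ung']] on Python 3.7+, where dicts keep insertion order
```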

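Answer 3 (score: 0)

As others have pointed out, you are looking for all the anagram groups in your word list. Here is one possible solution. The algorithm looks for candidates and picks one (the first element) as the canonical word, removing the rest from the set of possible words. Because being an anagram is transitive, once you have found that a word belongs to an anagram group you do not need to compute it again.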

input = ["umbellar","goa","umbrella","ago","aery","alem","ayre","gnu",
"eyra","egma","game","leam","amel","year","meal","yare","gun",
"alme","ung","male","lame","mela","mage" ]
res = dict()
for word in input:
    res[word]=[word]
for word in input:
    # The length test just avoids sorting and comparing words of different lengths.
    candidates = [x for x in list(res)
                  if len(x) == len(word) and sorted(x) == sorted(word)]
    if candidates:
        canonical = candidates[0]
        for c in candidates[1:]:
            # Delete every candidate except the canonical one...
            del res[c]
            # ...and add it to the canonical member's group.
            res[canonical].append(c)
print(list(res.values()))

This algorithm outputs (the order of groups and words depends on the dict ordering of the Python version used):

[['year', 'ayre', 'aery', 'yare', 'eyra'], ['umbellar', 'umbrella'],
 ['lame', 'leam', 'mela', 'amel', 'alme', 'alem', 'male', 'meal'],
 ['goa', 'ago'], ['game', 'mage', 'egma'], ['gnu', 'gun', 'ung']]

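Answer 4 (score: 0)

Shang's answer is correct, but I was challenged to do the same thing without using `groupby()`. Here it is. Adding print statements will help you debug the code and see the output at runtime.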

def group_words(word_list):
    # Collect the distinct sorted-letter signatures of all words.
    signatures = list(set(''.join(sorted(word)) for word in word_list))
    new_list = []
    for signature in signatures:
        # Gather every word whose sorted letters equal this signature.
        group = [word for word in word_list
                 if len(word) == len(signature)
                 and ''.join(sorted(word)) == signature]
        new_list.append(group)
    return new_list

Output (group order may vary):

[['umbellar', 'umbrella'], ['goa', 'ago'], ['gnu', 'gun', 'ung'], ['alem', 'leam', 'amel', 'meal', 'alme', 'male', 'lame', 'mela'], ['egma', 'game', 'mage'], ['aery', 'ayre', 'eyra', 'year', 'yare']]
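
Since the answers return the same groups in different orders, one way to check that two results agree is to normalize them before comparing. A hedged sketch; `normalize` is a hypothetical helper, not part of any answer, and the two sample results are shortened for illustration:

```python
def normalize(groups):
    # Sort the words inside each group, then sort the groups themselves,
    # so that ordering differences no longer matter.
    return sorted(sorted(group) for group in groups)

a = [['goa', 'ago'], ['gnu', 'gun', 'ung']]
b = [['ung', 'gnu', 'gun'], ['ago', 'goa']]
print(normalize(a) == normalize(b))  # True
```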