Conceptual: collecting "synonyms" from a list of "words"

Time: 2013-12-03 12:28:00

Tags: python algorithm conceptual

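This question is inspired by: Generating a list of repetitions regardless of the order, and its accepted answer: https://stackoverflow.com/a/20336020/1463143

The problem in this question is not clearly formulated, so please bear with me.

Here, an "alphabet" is any set of letters, e.g. '012' or 'EDCRFV'.

"Words" are obtained by taking the Cartesian product over the alphabet. We should be able to specify n to get n-letter words. For example: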

from itertools import product

alphabet = '012'
wordLen = 3
# each element produced by product() is a tuple of letters; join it into a word
wordList = [''.join(letters) for letters in product(alphabet, repeat=wordLen)]
print(wordList)

gives:

['000', '001', '002', '010', '011', '012', '020', '021', '022', '100', '101', '102', '110', '111', '112', '120', '121', '122', '200', '201', '202', '210', '211', '212', '220', '221', '222']

"Synonyms" are obtained by... uh... if only I could put it into words...

These lists contain all the possible "synonyms" within wordList:

['000',
 '111',
 '222'] 

['001',
 '002',
 '110',
 '112',
 '220',
 '221']

['010',
 '020',
 '101',
 '121',
 '202',
 '212']

['011',
 '022',
 '100',
 '122',
 '200',
 '211']

['012',
 '021',
 '102',
 '120',
 '201',
 '210']
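Sadly, I am unable to articulate how I obtained the "synonym" lists above. I would like to do the same thing for an arbitrary alphabet forming n-letter words.

3 answers:

Answer 0: (score: 3)

Looks pretty easy: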

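import collections

# group words by the index of each letter's first occurrence,
# e.g. '001' and '220' both map to (0, 0, 2)
syns = collections.defaultdict(list)

for w in wordList:
    key = tuple(w.index(c) for c in w)
    syns[key].append(w)

print(list(syns.values()))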
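
The key used above records, for each letter of a word, the position where that letter first occurs, so two words share a key exactly when they have the same pattern of repeated letters. Below is a minimal sketch that wraps the idea in a function for any alphabet and word length. It is written for Python 3, assumes the alphabet has no duplicate letters, and the names `first_occurrence_key` and `group_synonyms` are made up for this illustration rather than taken from the answer.

```python
from collections import defaultdict
from itertools import product


def first_occurrence_key(word):
    # For each letter, the index where that letter first appears in the word.
    # '001' and '220' both give (0, 0, 2).
    return tuple(word.index(c) for c in word)


def group_synonyms(alphabet, word_len):
    # Hypothetical helper: build every word of the given length and
    # collect the words that share the same first-occurrence key.
    groups = defaultdict(list)
    for letters in product(alphabet, repeat=word_len):
        word = ''.join(letters)
        groups[first_occurrence_key(word)].append(word)
    return list(groups.values())


groups = group_synonyms('012', 3)
for group in groups:
    print(group)
```

For the alphabet '012' and n = 3 this should reproduce the five lists from the question. The work is one pass over the word list, with one `index` lookup per letter.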

Answer 1: (score: 1)

A:

[ word for word in wordList 
    if  word[0] == word[1]
    and word[0] == word[2] ]

B:

[ word for word in wordList 
    if  word[0] == word[1]
    and word[0] != word[2] ]

C:

[ word for word in wordList 
    if  word[0] != word[1]
    and word[0] == word[2] ]

D:

[ word for word in wordList 
    if  word[0] != word[1]
    and word[1] == word[2] ]

E:

[ word for word in wordList 
    if  word[0] != word[1]
    and word[0] != word[2]
    and word[1] != word[2] ]

So it is every combination of equal/unequal relations between the letters, grouped together:
'abc' -> a<>b, b=c, c<>a; a=b, b=c, c=a; etc.

Exclude every empty result (e.g. a<>b, b=c, c=a).
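
The five hand-written comprehensions can be generalized by computing the equal/unequal relation for every pair of positions and using that tuple as a grouping key. The following is only a sketch of that idea for Python 3; `equality_pattern` and `by_pattern` are illustrative names, not part of the answer.

```python
from collections import defaultdict
from itertools import combinations, product


def equality_pattern(word):
    # One boolean per pair of positions (i, j) with i < j:
    # True if the letters at those positions are equal.
    return tuple(word[i] == word[j]
                 for i, j in combinations(range(len(word)), 2))


word_list = [''.join(p) for p in product('012', repeat=3)]

by_pattern = defaultdict(list)
for word in word_list:
    by_pattern[equality_pattern(word)].append(word)

for pattern, words in by_pattern.items():
    print(pattern, words)
```

Contradictory patterns such as a<>b, b=c, c=a never show up as keys because no word produces them, so there are no empty results to filter out afterwards.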

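Answer 2: (score: 0)

It seems the rule you want (for larger n) is the following:

Words u and v are synonyms iff u can be obtained from v by swapping characters of the alphabet, i.e. all the words obtained by applying every permutation of the alphabet are synonyms.

Example: let u = 001, with the alphabet 012.

There are six permutations of the alphabet: '012', '021', '102', '120', '201', '210'. Map u through each of these permutations to get the synonyms of u: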

'001'
'002'
'110'
'112'
'220'
'221'
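
The mapping step in this example can be sketched in Python 3 as follows. The function name `synonyms_of` is hypothetical, and the sketch assumes the alphabet contains no duplicate letters.

```python
from itertools import permutations


def synonyms_of(word, alphabet):
    # Apply every permutation of the alphabet to the word, letter by letter,
    # and collect the distinct results.
    result = set()
    for perm in permutations(alphabet):
        mapping = dict(zip(alphabet, perm))
        result.add(''.join(mapping[c] for c in word))
    return sorted(result)


print(synonyms_of('001', '012'))
# ['001', '002', '110', '112', '220', '221']
```

Since the number of permutations grows as the factorial of the alphabet size, this direct approach is only practical for small alphabets.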