Suppose we have the following list of strings:
Input: ["eat", "tea", "tan", "ate", "nat", "bat"]
The program should group the anagrams and return the groups together as a list of lists, like this:
Output:
[
["ate","eat","tea"],
["nat","tan"],
["bat"]
]
My current solution finds the first group of anagrams but fails to detect the other two groups; instead, it copies the first group into the list repeatedly:
class Solution(object):
    def groupAnagrams(self, strs):
        allResults=[]
        results=[]
        temp=''
        for s in strs:
            temp=s[1:]+s[:1]
            for i in range(0,len(strs)):
                if temp==strs[i]:
                    results.append(strs[i])
            allResults.append(results)
        return allResults
The output is:
[["ate","eat","tea"],["ate","eat","tea"],["ate","eat","tea"],["ate","eat","tea"],["ate","eat","tea"],["ate","eat","tea"]]
How can I fix this?
Edit:
I fixed the duplication by appending results to allResults outside the loops:
class Solution(object):
    def groupAnagrams(self, strs):
        allResults=[]
        results=[]
        temp=''
        for s in strs:
            temp=s[1:]+s[:1]
            for i in range(0,len(strs)):
                if temp==strs[i]:
                    results.append(strs[i])
        allResults.append(results)
        print(results)
        return allResults
However, it still does not detect the other two groups of anagrams.
Answer 0 (score: 4)
You can use defaultdict from Python's built-in collections module together with sorted:
In [1]: l = ["eat", "tea", "tan", "ate", "nat", "bat"]
In [2]: from collections import defaultdict
In [3]: d = defaultdict(list)
In [4]: for x in l:
   ...:     d[str(sorted(x))].append(x)
In [5]: d.values()
Out[5]: dict_values([['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']])
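The same idea can be wrapped in a reusable function. This is only a minimal sketch, not part of the session above: the name group_anagrams is hypothetical, and it uses a tuple of the sorted letters as the dictionary key instead of str(sorted(x)).

```python
from collections import defaultdict

def group_anagrams(words):
    """Group words that are anagrams of each other (hypothetical helper)."""
    groups = defaultdict(list)
    for word in words:
        # Anagrams contain exactly the same letters, so their sorted letters
        # are identical; a tuple is hashable and can serve as a dict key.
        groups[tuple(sorted(word))].append(word)
    # Dicts preserve insertion order (Python 3.7+), so each group appears
    # in the order its first member was seen.
    return list(groups.values())

print(group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"]))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
```

Each word is sorted once and placed with a single dictionary lookup, so no word is compared against every other word.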
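To fix your own solution, you need to add a variable that tracks which groups have already been added, for example (I use enumerate while iterating over strs so that the anagram search only scans from the current position onward, which saves a little work):
class Solution(object):
    def groupAnagrams(self, strs):
        allResults = []
        added = set()
        for i, s in enumerate(strs):
            results = []
            unique_s = "".join(sorted(s))
            if unique_s in added:
                continue
            added.add(unique_s)
            # Earlier words with this key would already have created the group,
            # so only strs[i:] needs to be scanned.
            for x in strs[i:]:
                if unique_s == "".join(sorted(x)):
                    results.append(x)  # append the matching word x, not strs[i]
            allResults.append(results)
        print(added)  # debug output: the sorted-letter keys seen so far
        return allResults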
Answer 1 (score: 3)
Use itertools.groupby:
>>> lst = ["eat", "tea", "tan", "ate", "nat", "bat"]
>>>
>>> from itertools import groupby
>>> f = lambda w: sorted(w)
>>> [list(v) for k,v in groupby(sorted(lst, key=f), f)]
[['bat'], ['eat', 'tea', 'ate'], ['tan', 'nat']]
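Note that groupby only merges consecutive items with equal keys, which is why the list is sorted by the same key first. The sketch below illustrates the difference; the helper name group_with_groupby and its presort flag are hypothetical and exist only for this demonstration.

```python
from itertools import groupby

def group_with_groupby(words, presort=True):
    """Group words by their sorted letters using groupby (hypothetical helper)."""
    key = lambda w: "".join(sorted(w))
    # groupby starts a new group whenever the key changes, so equal keys
    # must be adjacent; sorting by the same key guarantees that.
    items = sorted(words, key=key) if presort else words
    return [list(group) for _, group in groupby(items, key)]

words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_with_groupby(words))
# [['bat'], ['eat', 'tea', 'ate'], ['tan', 'nat']]
print(group_with_groupby(words, presort=False))
# [['eat', 'tea'], ['tan'], ['ate'], ['nat'], ['bat']]
```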
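Answer 2 (score: 0)
The way your function is implemented, you only look at rotations of the string (i.e. moving a letter from the front to the end, e.g. a-t-e -> t-e-a -> e-a-t). If only two letters are swapped (n-a-t -> t-a-n), your algorithm cannot detect that permutation. In mathematical terms, you consider only the even permutations of a three-letter string and ignore the odd ones.
For example, a modification of your code could be: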
def get_list_of_permutations(input_string):
    list_out = []
    if len(input_string) > 1:
        first_char = input_string[0]
        remaining_string = input_string[1:]
        remaining_string_permutations = get_list_of_permutations(remaining_string)
        for i in range(len(remaining_string)+1):
            for permutation in remaining_string_permutations:
                list_out.append(permutation[0:i]+first_char+permutation[i:])
    else:
        return [input_string]
    return list_out

def groupAnagrams(strs):
    allResults=[]
    for s in strs:
        results = []
        list_of_permutations = get_list_of_permutations(s)
        for i in range(0,len(strs)):
            if strs[i] in list_of_permutations:
                results.append(strs[i])
        if results not in allResults:
            allResults.append(results)
    return allResults
The output is:
Out[218]: [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
Edit: modified the code so that it works for strings of any length.
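The gap between rotations and full permutations described in this answer can be checked directly. The following is a small sketch under the assumption that itertools.permutations is acceptable for the comparison; rotations and all_permutations are hypothetical helper names.

```python
from itertools import permutations

def rotations(s):
    """All rotations of s, e.g. 'ate' -> 'tea' -> 'eat' (hypothetical helper)."""
    return {s[i:] + s[:i] for i in range(len(s))}

def all_permutations(s):
    """Every ordering of the letters of s (hypothetical helper)."""
    return {"".join(p) for p in permutations(s)}

print(sorted(rotations("nat")))          # ['atn', 'nat', 'tna']
print("tan" in rotations("nat"))         # False: a swap is not a rotation
print("tan" in all_permutations("nat"))  # True
```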
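Answer 3 (score: 0)
The second line, s_words, takes all the letters of each word in words, sorts them, and rebuilds a string from the sorted letters. It produces a list of these sorted-letter strings in the same order as the original words. This list is used to compare candidate anagrams, because the letters of anagrams yield the same string once sorted.
The third line, indices, holds True or False values that indicate whether the corresponding word has already been extracted, which avoids duplicates.
The code that follows is a double loop that, for each s_word, determines which other s_words are identical and uses their indices to retrieve the corresponding words from the original word list; it also updates the truth values in indices.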
words = ["eat", "tea", "tan", "ate", "nat", "bat"]
s_words = [''.join(sorted(list(word))) for word in words]
indices = [False for _ in range(len(words))]
anagrams = []
for idx, s_word in enumerate(s_words):
    if indices[idx]:
        continue
    ana = [words[idx]]
    for jdx, word in enumerate(words):
        if idx != jdx and not indices[jdx] and s_word == s_words[jdx]:
            ana.append(words[jdx])
            indices[jdx] = True
    anagrams.append(ana)
print(anagrams)
[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
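The comparison this answer relies on, that anagrams yield the same string once their letters are sorted, can be isolated in a small helper. This is just an illustrative sketch; signature and is_anagram are hypothetical names, not part of the answer's code.

```python
def signature(word):
    """Return the letters of word in sorted order, joined into one string."""
    return "".join(sorted(word))

def is_anagram(a, b):
    """Two words are anagrams exactly when their signatures match."""
    return signature(a) == signature(b)

print(signature("tea"))          # aet
print(is_anagram("tan", "nat"))  # True
print(is_anagram("tan", "bat"))  # False
```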
Answer 4 (score: 0)
https://docs.python.org/3/library/itertools.html#itertools.permutations
from itertools import permutations

word_list = ["eat", "tea", "tan", "ate", "nat", "bat"]
anagram_group_list = []
for word in word_list:
    if word is None:
        pass
    else:
        anagram_group_list.append([])
        for anagram in permutations(word):
            anagram = ''.join(anagram)
            try:
                idx = word_list.index(anagram)
                word_list[idx] = None
                anagram_group_list[-1].append(anagram)
            except ValueError:
                pass # this anagram is not present in word_list
print(anagram_group_list)
# [['eat', 'ate', 'tea'], ['tan', 'nat'], ['bat']]
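Even after refactoring your code and stopping it from producing redundant results, it still does not give the expected output, because the logic that generates the anagrams is not entirely correct: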
def groupAnagrams(word_list):
    allResults=[]
    results=[]
    for idx,s in enumerate(word_list):
        if s is None:
            pass
        else:
            results = [s] # word s is added to anagram list
            # you were generating only 1 anagram like for tan --> ant but in word_list only nat was present
            for i in range(1,len(s),1):
                temp = s[i:]+s[:i] #anagram
                # for s = 'tan' it generates only 'ant' and 'nta'
                # when it should generate all six tna ant nta _nat_ atn tan
                if temp in word_list:
                    results.append(temp)
                    word_list[word_list.index(temp)] = None
            allResults.append(results)
    return allResults

print(groupAnagrams(["eat", "tea", "tan", "ate", "nat", "bat"]))
# [['eat', 'ate', 'tea'], ['tan'], ['nat'], ['bat']]
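One caveat when generating every permutation, as the itertools.permutations approach in this answer does: the number of orderings grows factorially with word length. The sketch below only illustrates that growth; count_permutations is a hypothetical helper and the sample words are arbitrary.

```python
from itertools import permutations
from math import factorial

def count_permutations(word):
    """Count the orderings itertools.permutations yields for word (hypothetical helper)."""
    return sum(1 for _ in permutations(word))

for word in ["bat", "listen", "triangle"]:
    count = count_permutations(word)
    # permutations treats positions as distinct, so the count is always n!.
    assert count == factorial(len(word))
    print(word, count)
# bat 6
# listen 720
# triangle 40320
```

For longer words, comparing sorted letters may therefore be the cheaper test, since it needs only one sort per word.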