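Finding strings that are anagrams

Time: 2016-09-08 19:05:44

Tags: python string anagram

This question is about this problem on lintcode. I have a working solution, but it takes too long on the very large test cases. How can I improve it? Perhaps I can reduce the number of comparisons made in the outer loop.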

class Solution:
    # @param strs: A list of strings
    # @return: A list of strings
    def anagrams(self, strs):
        # write your code here
        ret=set()
        for i in range(0,len(strs)):
            for j in range(i+1,len(strs)):
                if i in ret and j in ret:
                    continue
                if Solution.isanagram(strs[i],strs[j]):
                    ret.add(i)
                    ret.add(j)

        return [strs[i] for i in list(ret)]


    @staticmethod
    def isanagram(s, t):
        if len(s)!=len(t):
            return False
        chars={}
        for i in s:
            if i in chars:
                chars[i]+=1
            else:
                chars[i]=1

        for i in t:
            if i not in chars:
                return False
            else:
                chars[i]-=1
                if chars[i]<0:
                    return False

        for i in chars:
            if chars[i]!=0:
                return False
        return True

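Update: To add, I am not looking for built-in Pythonic solutions, such as using Counter, which is already optimized. I have added Mike's suggestions, but the time limit is still exceeded.

7 answers:

Answer 0 (score: 3)

Skip strings that have already been placed in the set. Don't test them again.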

# @param strs: A list of strings
# @return: A list of strings
def anagrams(self, strs):
    # write your code here
    ret=set()
    for i in range(0,len(strs)):
        for j in range(i+1,len(strs)):

            # If both anagrams exist in set, there is no need to compare them.
            if i in ret and j in ret:
                continue

            if Solution.isanagram(strs[i],strs[j]):
                ret.add(i)
                ret.add(j)

    return [strs[i] for i in list(ret)]

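You can also do a length comparison in the anagram test before iterating over the letters. Whenever the strings have different lengths, they cannot be anagrams anyway. Also, when a counter in chars reaches -1 while comparing the values in t, just return False. Don't iterate through chars again.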

@staticmethod
def isanagram(s, t):
    # Test strings are the same length
    if len(s) != len(t):
        return False

    chars={}
    for i in s:
        if i in chars:
            chars[i]+=1
        else:
            chars[i]=1

    for i in t:
        if i not in chars:
            return False
        else:
            chars[i]-=1
            # If this is below 0, return false
            if chars[i] < 0:
                return False

    # Equal lengths and no negative counts imply every count is zero,
    # so there is no need to iterate over chars again.
    return True
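As a rough illustration of why the skip alone may not be enough (the asker's update says the time limit was still exceeded), the sketch below counts how often the pairwise loop actually calls the anagram test. `count_pair_checks` and the `sorted`-based `is_anagram` are hypothetical helpers introduced only for this illustration; they are not part of the answer.

```python
def is_anagram(s, t):
    # Stand-in anagram test (an assumption, used only for counting calls).
    return sorted(s) == sorted(t)


def count_pair_checks(strs, test):
    """Run the pairwise loop with the skip and count the calls to test."""
    calls = 0
    found = set()
    for i in range(len(strs)):
        for j in range(i + 1, len(strs)):
            # Skip pairs whose members are both already known anagrams.
            if i in found and j in found:
                continue
            calls += 1
            if test(strs[i], strs[j]):
                found.add(i)
                found.add(j)
    return calls


# No two strings are anagrams: every pair is still tested.
print(count_pair_checks(["a" * n for n in range(1, 101)], is_anagram))  # 4950
# Every string is an anagram of the others: the skip helps a lot.
print(count_pair_checks(["ab", "ba"] * 50, is_anagram))  # 99
```

When few of the inputs are anagrams, the number of tests stays quadratic in the number of strings, so the skip only helps on inputs that are mostly anagrams.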

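Answer 1 (score: 2)

Instead of comparing all pairs of strings, you can create a dictionary (or collections.defaultdict) that maps each set of letter counts to the words having those counts. To get the letter counts, you can use collections.Counter. Afterwards, you just have to get the values from that dictionary. If you want all words that are anagrams of any other word, merge only the lists that have more than one entry.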

from collections import Counter, defaultdict

strings = ["cat", "act", "rat", "hut", "tar", "tact"]
anagrams = defaultdict(list)

for s in strings:
    anagrams[frozenset(Counter(s).items())].append(s)

print([v for v in anagrams.values()])
# e.g. [['hut'], ['rat', 'tar'], ['cat', 'act'], ['tact']] (group order may vary)
print([x for v in anagrams.values() if len(v) > 1 for x in v])
# ['cat', 'act', 'rat', 'tar']

Of course, if you prefer not to use built-in functionality, you can just as well use a regular dict instead of defaultdict and write your own Counter, similar to what you have in your isanagram method, just without the comparison part.
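A minimal sketch of that variant, assuming a hand-written counting helper in place of Counter and a plain dict in place of defaultdict. The names `letter_counts` and `anagrams` are chosen here for illustration and do not come from the answer.

```python
def letter_counts(word):
    """Hand-rolled replacement for Counter: map each letter to its count."""
    counts = {}
    for ch in word:
        if ch in counts:
            counts[ch] += 1
        else:
            counts[ch] = 1
    return counts


def anagrams(strs):
    """Return every word that shares its letter counts with another word."""
    groups = {}
    for word in strs:
        # A frozenset of (letter, count) pairs is hashable, so it can be a key.
        key = frozenset(letter_counts(word).items())
        if key in groups:
            groups[key].append(word)
        else:
            groups[key] = [word]

    result = []
    for group in groups.values():
        if len(group) > 1:
            result.extend(group)
    return result


print(anagrams(["cat", "act", "rat", "hut", "tar", "tact"]))
# ['cat', 'act', 'rat', 'tar'] on Python 3.7+, where dicts keep insertion order
```

Each word is counted once and looked up once, so the work grows with the total number of letters rather than with the number of pairs of words.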

Answer 2 (score: 1)

As a complement to @Mike's great answer, here is a nice Pythonic way to do it:

import collections


class Solution:
    # @param strs: A list of strings
    # @return: A list of strings
    def anagrams(self, strs):
        patterns = Solution.find_anagram_words(strs)
        return [word for word in strs if ''.join(sorted(word)) in patterns]

    @staticmethod
    def find_anagram_words(strs):
        anagrams = collections.Counter(''.join(sorted(word)) for word in strs)
        return {word for word, times in anagrams.items() if times > 1}

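Answer 3 (score: 1)

Your solution is slow because you are not taking advantage of Python's data structures.

Here is a solution that collects the results in a dict: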

class Solution:
    def anagrams(self, strs):
        d = {}
        for word in strs:
            key = tuple(sorted(word))
            try:
                d[key].append(word)
            except KeyError:
                d[key] = [word]
        return [w for ws in d.values() for w in ws if len(ws) > 1]
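One detail that may matter depending on what the judge expects: the dict-based version returns the words grouped by key, while the Counter-based answer above filters the input and so keeps the original order. The sketch below restates both ideas as standalone functions (the function names are made up for this comparison) to show the difference.

```python
import collections


def anagrams_grouped(strs):
    # Same idea as the dict-based answer: bucket words by their sorted letters.
    d = {}
    for word in strs:
        d.setdefault(tuple(sorted(word)), []).append(word)
    return [w for ws in d.values() if len(ws) > 1 for w in ws]


def anagrams_in_input_order(strs):
    # Same idea as the Counter-based answer: keep input order, filter by key count.
    seen = collections.Counter(''.join(sorted(word)) for word in strs)
    return [word for word in strs if seen[''.join(sorted(word))] > 1]


words = ["cat", "rat", "act", "hut", "tar"]
print(anagrams_grouped(words))         # ['cat', 'act', 'rat', 'tar']
print(anagrams_in_input_order(words))  # ['cat', 'rat', 'act', 'tar']
```

Both return the same set of words; only the ordering differs (the grouped order shown assumes Python 3.7+, where dicts preserve insertion order).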

Answer 4 (score: 0)

Why not this?

str1 = "cafe"
str2 = "face"
def isanagram(s1, s2):
    # Two strings are anagrams exactly when their sorted letters match.
    return sorted(s1) == sorted(s2)

if isanagram(str1, str2):
    print("Woo")

Answer 5 (score: 0)

If you use Linq in C#, you can do it in a single line of code

string[] strs; // the input array of strings

var result = strs.GroupBy(x => new string(x.ToCharArray().OrderBy(z => z).ToArray())).Select(g => g.ToList()).ToList();
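For readers following the Python side of the question, a rough equivalent of that grouping can be sketched with itertools.groupby. This is an illustration rather than a translation of the C# line: itertools.groupby only merges adjacent items, so the input is sorted by key first, and the groups therefore come out in key order. The name `group_anagrams` is made up here.

```python
from itertools import groupby


def group_anagrams(strs):
    """Group words whose sorted letters are identical."""
    def key(word):
        return ''.join(sorted(word))

    # groupby merges only neighbouring items with equal keys,
    # so sort by the same key before grouping.
    return [list(group) for _, group in groupby(sorted(strs, key=key), key=key)]


print(group_anagrams(["cat", "act", "rat", "hut", "tar", "tact"]))
# [['cat', 'act'], ['tact'], ['rat', 'tar'], ['hut']]
```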

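Answer 6 (score: 0)

To group anagrams in Python, we first sort the letters of each word in the list. Then we create a dictionary. Its keys (the sorted words) tell us which words belong together, and its values are the indices of those anagrams in the original list.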


def groupAnagrams(words):

    # sort the letters of each word to get its anagram key
    A = [''.join(sorted(word)) for word in words]
    groups = {}
    for index, key in enumerate(A):
        groups.setdefault(key, []).append(index)
    print(groups)
    # e.g. {'AOOPR': [0, 2, 5, 9, 11], 'ABTU': [1, 3, 4], ...}

    for indices in groups.values():
        print([words[i] for i in indices])
 

if __name__ == '__main__':
 
    # list of words
    words = ["ROOPA","TABU","OOPAR","BUTA","BUAT" , "PAROO","Soudipta",
        "Kheyali Park", "Tollygaunge", "AROOP","Love","AOORP", "Protijayi","Paikpara","dipSouta","Shyambazaar",
        "jayiProti", "North Calcutta", "Sovabazaar"]
 
    groupAnagrams(words)

Output:


['ROOPA', 'OOPAR', 'PAROO', 'AROOP', 'AOORP']
['TABU', 'BUTA', 'BUAT']
['Soudipta', 'dipSouta']
['Kheyali Park']
['Tollygaunge']
['Love']
['Protijayi', 'jayiProti']
['Paikpara']
['Shyambazaar']
['North Calcutta']
['Sovabazaar']