Question

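Write a function named remove_duplicates that takes a single parameter named string. The input string will only contain characters between a-z. The function should remove all duplicate characters from the string and return a tuple with two values: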

A new string containing only the unique characters, sorted.
The total number of duplicates removed.

For example:

remove_duplicates('aaabbbac') should produce ('abc', 5)
remove_duplicates('a') should produce ('a', 0)
remove_duplicates('thelexash') should produce ('aehlstx', 2)

My code:

    def remove_duplicates(string):

        for string in "abcdefghijklmnopqrstuvwxyz":

            k = set(string)

            x = len(string) - len(set(string))

            return k, x

    print(remove_duplicates("aaabbbccc"))

Expected output:

I expected it to print ({a, b, c}, 6), but it prints ({a}, 0) instead.

What is wrong with the code above? Why doesn't it produce what I expect?

Answer 1

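You will get the expected result if you drop the loop. The for loop rebinds the name string to each letter of the alphabet in turn, so the original argument is lost, and the return inside the loop exits on the very first iteration.

I have commented your code so you can see the difference between your script and mine.

Commented non-working code: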

def remove_duplicates(string):

    #loop through each char in "abcdefghijklmnopqrstuvwxyz" and call it "string"
    for string in "abcdefghijklmnopqrstuvwxyz":

        #create variable k that holds a set of 1 char because of the loop
        k = set(string)

        # create a variable x that holds the difference between 1 and 1 = 0
        x = len(string) - len(set(string))

        #return these values in each iteration
        return k, x

print(remove_duplicates("aaabbbccc"))

Output:

({'a'}, 0)
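
To make the rebinding visible, here is a minimal sketch. The helper show_shadowing is hypothetical and not part of the original answer; it only records what the name string refers to before the loop and on its first pass.

```python
def show_shadowing(string):
    # Hypothetical helper for illustration only.
    before = string  # the argument passed by the caller
    for string in "abcdefghijklmnopqrstuvwxyz":
        inside = string  # the loop has rebound `string` to one letter
        break            # the original code returns here, on the first pass
    return before, inside


print(show_shadowing("aaabbbccc"))  # ('aaabbbccc', 'a')
```

After the first pass the name string holds 'a', so set(string) is {'a'} and len(string) - len(set(string)) is 1 - 1 = 0, which matches the output above.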

Working code:

def remove_duplicates(string):

    #create variable k that holds a set of each unique char present in string
    k = set(string)

    # create a variable x that holds the length of string minus the number of unique chars
    x = len(string) - len(set(string))

    #return these values
    return k, x

print(remove_duplicates("aaabbbccc"))

Output:

({'b', 'c', 'a'}, 6)

P.S.: If you want the result ordered, you can change return k, x to return sorted(k), x, but the first value will then be a list rather than a set.

(['a', 'b', 'c'], 6)

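Edit: If you want the code to run only when a specific condition is met, for example only when the string contains no digits, you can add an if/else clause:

Example code: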

def remove_duplicates(s):

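    # any() catches a digit anywhere in s; s.isdigit() is True only when every character is a digit
    if not any(ch.isdigit() for ch in s):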
        k = set(s)
        x = len(s) - len(set(s))
        return sorted(k), x
    else:
        msg = "This function only works with strings that don't contain any digits."
        return msg


print(remove_duplicates("aaabbbccc"))
print(remove_duplicates("123123122"))

Output:

(['a', 'b', 'c'], 6)
This function only works with strings that don't contain any digits.
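
One thing to keep in mind with the example above is that it returns a tuple in one branch and a plain message string in the other, so callers have to check which one they got. A possible alternative, sketched here under the assumption that raising an exception is acceptable and that only a-z is valid (as the question states), is to reject bad input with ValueError. The name remove_duplicates_strict is hypothetical.

```python
from string import ascii_lowercase


def remove_duplicates_strict(s):
    # Hypothetical variant, not from the original answer.
    # Collect any characters that fall outside a-z.
    invalid = set(s) - set(ascii_lowercase)
    if invalid:
        raise ValueError(f"only a-z is allowed, got: {sorted(invalid)}")
    unique = set(s)
    return sorted(unique), len(s) - len(unique)


print(remove_duplicates_strict("aaabbbccc"))  # (['a', 'b', 'c'], 6)
```

With this shape the function always returns the same kind of value on success, and invalid input such as "abc123" surfaces as an exception instead of a string that looks like a result.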

Answer 2

You return from the function during the first iteration of the loop, so it only ever processes the first letter, 'a'.

Try this instead:

def remove_duplicates(string):
    temp = set(string)
    return temp,len(string) - len(temp)


print(remove_duplicates("aaabbbccc"))

Output:

({'c', 'b', 'a'}, 6)

If you want to remove everything except letters (as you mentioned in the comments), try this:

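def remove_duplicates(string):
    a = set()
    for i in string:
        # keep only letters; digits, spaces and punctuation are skipped
        if i.isalpha() and i not in a:
            a.add(i)
    # the count includes the skipped non-letters as well as the duplicates
    return a, len(string) - len(a)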
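
Note that the second value in the snippet above is len(string) - len(a), so it counts skipped non-letters together with duplicate letters. If the two numbers should be reported separately, one possible sketch follows. The function name and the three-value return are assumptions made for illustration, not part of the original answer.

```python
def remove_duplicates_letters_only(string):
    # Hypothetical variant: separates duplicate letters from dropped non-letters.
    letters = [ch for ch in string if ch.isalpha()]
    unique = set(letters)
    duplicates = len(letters) - len(unique)  # repeated letters only
    dropped = len(string) - len(letters)     # digits, spaces, punctuation
    return unique, duplicates, dropped


unique, duplicates, dropped = remove_duplicates_letters_only("aa11bb")
print(sorted(unique), duplicates, dropped)  # ['a', 'b'] 2 2
```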

Answer 3

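In your code, the function returns during the first iteration of the loop. At that point string no longer refers to the input: the loop has rebound it to the first character of the alphabet string, 'a'. I think you meant to iterate over the string argument character by character. For that kind of per-character counting you could use collections.Counter, which tallies every character in a single pass.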
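
As a rough sketch of the Counter idea (the function name is hypothetical, and this is only one way to apply it), each character that occurs n times contributes n - 1 duplicates:

```python
from collections import Counter


def remove_duplicates_with_counter(s):
    # Hypothetical name; illustrates the Counter-based approach.
    counts = Counter(s)  # maps each character to its number of occurrences
    new_sorted_string = "".join(sorted(counts))  # iterating a Counter yields its keys
    number_of_duplicates = sum(n - 1 for n in counts.values())
    return new_sorted_string, number_of_duplicates


print(remove_duplicates_with_counter("thelexash"))  # ('aehlstx', 2)
```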

However, we can use an alternative solution that avoids counting the occurrences of each character in the given string:

def remove_duplicates(s):
    unique_characters = set(s) # extract the unique characters in the given string
    new_sorted_string = ''.join(sorted(unique_characters)) # create the sorted string with unique characters
    number_of_duplicates = len(s) - len(unique_characters) # compute the number of duplicates in the original string
    return new_sorted_string, number_of_duplicates

Answer 4

def remove_duplicates(s):
    unique_characters = set(s) # extract the unique characters in the given string
    new_sorted_string = ''.join(sorted(unique_characters)) # create the sorted string with unique characters
    number_of_duplicates = len(s) - len(unique_characters) # compute the number of duplicates in the original string
    return new_sorted_string, number_of_duplicates
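
A quick way to confirm that this function matches the task is to run the examples from the question as assertions. The function is repeated below only so the sketch runs on its own; the count of 5 for 'aaabbbac' follows from 8 characters minus 3 unique ones.

```python
def remove_duplicates(s):
    unique_characters = set(s)
    new_sorted_string = ''.join(sorted(unique_characters))
    number_of_duplicates = len(s) - len(unique_characters)
    return new_sorted_string, number_of_duplicates


# Examples taken from the question; an AssertionError would signal a mismatch.
assert remove_duplicates('aaabbbac') == ('abc', 5)
assert remove_duplicates('a') == ('a', 0)
assert remove_duplicates('thelexash') == ('aehlstx', 2)
print("all examples pass")
```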

String practice with Python 3
