Question

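I am working on LeetCode problem No. 387, First Unique Character in a String. Given a string, find the first non-repeating character in it and return its index. If it does not exist, return -1.

Examples: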

s = "leetcode"
return 0.

s = "loveleetcode",
return 2.

I wrote two algorithms:

Method 1

def firstUniqChar(s):
    d = {}
    L = len(s)
    for i in range(L):
        if s[i] not in d:
            d[s[i]] = [i]
        else:
            d[s[i]].append(i)
    M = L
    for k in d:
        if len(d[k])==1:
            if d[k][0]<M:
                M = d[k][0]
    if M<L:
        return M
    else:
        return -1

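This is quite intuitive: first loop over all characters in s to build a dictionary that maps each character to the list of its indices (the counting could also be done in one line with collections.Counter), then do a second loop that checks the keys whose value is a list of length 1. Since I do two loops, I assumed there must be some redundant computation. So I wrote a second algorithm, which I thought would be better than the first, but on the LeetCode platform the second runs much slower than the first, and I cannot figure out why.

Method 2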

def firstUniqChar(s):
    d = {}
    L = len(s)
    A = []
    for i in range(L):
        if s[i] not in d:
            d[s[i]] = i
            A.append(i)
        else:
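            try:
                # list.remove scans A from the front to find the value
                A.remove(d[s[i]])
            except ValueError:
                # the index was already removed on an earlier repeat
                pass

    if len(A)==0:
        return -1
    else:
        return A[0]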

The second one loops over the characters in s only once.

Answer 1

Your first solution is O(n), but your second solution is O(n^2), because the method A.remove loops over the elements of A.
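To illustrate the point, here is a minimal sketch of how Method 2's single-pass idea could be kept while avoiding the linear scan in A.remove. It is not the asker's code: the function name first_uniq_char_single_pass and the variables seen and candidates are made up for illustration, and it assumes Python 3.7 or later, where dicts preserve insertion order.

```python
def first_uniq_char_single_pass(s):
    # Hypothetical variant of Method 2: the candidate list A is replaced
    # by a dict, so dropping a repeated character is O(1) on average.
    seen = set()       # every character encountered so far
    candidates = {}    # character -> first index, kept in insertion order
    for i, ch in enumerate(s):
        if ch in seen:
            # Repeat: drop it from the candidates if it is still there.
            candidates.pop(ch, None)
        else:
            seen.add(ch)
            candidates[ch] = i
    # The first surviving candidate is the first unique character.
    return next(iter(candidates.values()), -1)
```

With the two examples from the question, this returns 0 for "leetcode" and 2 for "loveleetcode".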

Answer 2

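As others have said, using list.remove is quite expensive... Using collections.Counter is a good idea.

You need to scan the string once to find the unique characters. Then it is probably better to scan it again in order and take the index of the first unique one, which would make your potential code: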

from collections import Counter

s = "loveleetcode"

# Build a set of unique values
unique = {ch for ch, freq in Counter(s).items() if freq == 1}
# re-iterate over the string until we first find a unique value or 
# not - default to -1 if none found
first_index = next((ix for ix, ch in enumerate(s) if ch in unique), -1)
# 2
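For submitting to LeetCode, the same two-scan idea could be wrapped in a function roughly like the sketch below. The name first_uniq_char is only illustrative (the platform expects its own method signature), and the count lookup is done directly on the Counter rather than through a separate set.

```python
from collections import Counter

def first_uniq_char(s):
    # First scan: count how often each character occurs.
    counts = Counter(s)
    # Second scan: return the index of the first character seen exactly once.
    for ix, ch in enumerate(s):
        if counts[ch] == 1:
            return ix
    # No character occurs exactly once.
    return -1
```

Both scans are linear in the length of s, so the whole function stays O(n).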

Why is my second method slower than the first?
