Question

My teacher challenged me to find a way to count the occurrences of the word "bob" in any random string variable without using str.count(). So I did this:

a = "dfjgnsdfgnbobobeob bob"
compteurDeBob = 0
for i in range(len(a) - 2):
   if a[i] == "b":
       if a[i+1] == "o":
           if a[i+2] == "b":
               compteurDeBob += 1
print(compteurDeBob)

But I would like to find a way to handle a word of any length, as shown below, and I have no idea how to do that...

a = input("random string: ")
word = input("Wanted word: ")
compteurDeBob = 0
for i in range (len(a)-1):

   #... i don't know... 

print(compteurDeBob)

Answer 1

You can use string slicing. One way to adapt your code:

a = 'dfjgnsdfgnbobobeob bob'

counter = 0
value = 'bob'
chars = len(value)

for i in range(len(a) - chars + 1):
    if a[i: i + chars] == value:
        counter += 1

This can be written more succinctly with sum and a generator expression:

counter = sum(a[i: i + chars] == value for i in range(len(a) - chars + 1))

This works because bool is a subclass of int in Python, i.e. True and False are treated as 1 and 0 respectively.
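To see why summing comparison results counts the matches, here is a quick check. The `matches` list is just an illustrative value, not something from the code above:

```python
# bool is a subclass of int, so True and False behave like 1 and 0 in arithmetic
print(issubclass(bool, int))  # True
print(True + True)            # 2

# Illustrative list of comparison results (hypothetical values)
matches = [True, False, True]
total = sum(matches)
print(total)                  # 2
```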

Note that str.count will not work here, since it only counts non-overlapping matches. If built-ins are allowed, you could make use of str.find.
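A minimal sketch of the str.find idea, assuming built-ins other than str.count are acceptable. The function name `count_with_find` is made up for illustration, and treating an empty word as zero matches is my own assumption:

```python
def count_with_find(text, word):
    """Count overlapping occurrences of word in text using str.find."""
    if not word:
        return 0  # assumption: an empty word counts as no match
    count = 0
    start = text.find(word)
    while start != -1:
        count += 1
        # resume one character after the last hit so overlaps are not skipped
        start = text.find(word, start + 1)
    return count

a = 'dfjgnsdfgnbobobeob bob'
print(count_with_find(a, 'bob'))  # 3
print(a.count('bob'))             # 2, since str.count skips the overlapping match
```

Restarting the search at `start + 1` rather than `start + len(word)` is what makes the overlapping "bobob" count twice.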

Answer 2

a = input("random string: ")
word = input("Wanted word: ")

count = 0
for i in range(len(a) - len(word) + 1):  # + 1 so a match at the very end is checked
    if a[i:i+len(word)] == word:
        count += 1
print(count)

If you want the search to be case-insensitive, you can use the lower() method:

a = input("random string: ").lower()
word = input("Wanted word: ").lower()

count = 0
for i in range(len(a)):
    if a[i:i+len(word)] == word:
        count += 1
print(count)

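For the user input

Hi Bob. This is bob

(for example, with Bob as the wanted word) the first approach will output 1 and the second approach will output 2.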
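The two variants can be folded into one function. This is only a sketch: `count_occurrences` and its `ignore_case` parameter are hypothetical names, not part of the code above.

```python
def count_occurrences(text, word, ignore_case=False):
    """Count overlapping occurrences of word in text by slicing."""
    if ignore_case:
        # lowercase both sides so "Bob" and "bob" compare equal
        text, word = text.lower(), word.lower()
    size = len(word)
    count = 0
    for i in range(len(text) - size + 1):
        if text[i:i + size] == word:
            count += 1
    return count

sample = "Hi Bob. This is bob"
print(count_occurrences(sample, "bob"))                    # 1
print(count_occurrences(sample, "bob", ignore_case=True))  # 2
```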

Answer 3

To count all overlapping occurrences (as in your example), you can slice the string in a loop:

a = input("random string: ")
word = input("Wanted word: ")    
cnt = 0

for i in range(len(a)-len(word)+1):
    if a[i:i+len(word)] == word:
        cnt += 1

print(cnt)

Answer 4

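A fast way to count overlapping matches is the Knuth-Morris-Pratt algorithm [wiki], which runs in O(m + n) time in the worst case, where m is the length of the word to match and n is the length of the string being searched.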

The algorithm first builds a lookup table that acts more or less as the description of a finite state machine (FSM). We can construct such a table with:

def build_kmp_table(word):
    t = [-1] * (len(word)+1)
    cnd = 0
    for pos in range(1, len(word)):
        if word[pos] == word[cnd]:
            t[pos] = t[cnd]
        else:
            t[pos] = cnd
            cnd = t[cnd]
            while cnd >= 0 and word[pos] != word[cnd]:
                cnd = t[cnd]
        cnd += 1
    t[len(word)] = cnd
    return t

Then we can count with:

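def count_kmp(string, word):
    n = 0                      # number of matches found so far
    wn = len(word)
    t = build_kmp_table(word)
    k = 0                      # current position in word
    j = 0                      # current position in string
    while j < len(string):
        if string[j] == word[k]:
            k += 1
            j += 1
            if k >= wn:
                # full match: count it, then fall back so overlaps are still found
                n += 1
                k = t[k]
        else:
            # mismatch: fall back in word without moving backwards in string
            k = t[k]
            if k < 0:
                k += 1
                j += 1
    return n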

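The code above counts the overlapping instances in time linear in the length of the string being searched, which is an improvement over the "slicing" approach used earlier, which runs in O(m × n).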
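For comparison, here is a compact sketch of the same idea written with the prefix function (failure function) formulation of KMP, a slightly different variant from the table used above. The names `count_overlapping_prefix` and `count_by_slicing` are hypothetical, and returning 0 for an empty word is an assumption of this sketch.

```python
def count_overlapping_prefix(text, word):
    """Count overlapping occurrences of word in text with a KMP prefix function."""
    if not word:
        return 0  # assumption: an empty word counts as no match
    # fail[i] = length of the longest proper prefix of word[:i + 1]
    # that is also a suffix of it
    fail = [0] * len(word)
    k = 0
    for i in range(1, len(word)):
        while k > 0 and word[i] != word[k]:
            k = fail[k - 1]
        if word[i] == word[k]:
            k += 1
        fail[i] = k
    count = 0
    k = 0  # number of characters of word currently matched
    for ch in text:
        while k > 0 and ch != word[k]:
            k = fail[k - 1]
        if ch == word[k]:
            k += 1
        if k == len(word):
            count += 1
            k = fail[k - 1]  # fall back so overlapping matches are found
    return count


def count_by_slicing(text, word):
    """O(m * n) reference implementation used only as a cross-check."""
    size = len(word)
    return sum(text[i:i + size] == word for i in range(len(text) - size + 1))


a = "dfjgnsdfgnbobobeob bob"
print(count_overlapping_prefix(a, "bob"))  # 3
print(count_by_slicing(a, "bob"))          # 3
```

Both functions agree on the example string. The KMP variant never re-reads a character of the text, which is where the linear bound comes from.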

Counting occurrences of a substring without using built-in functions
