Question

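I'm trying to solve a problem where I'm given a set of strings and have to count how many times a certain word, such as "code", appears in a string. The program should also count any variant in which the "d" is replaced by another character, such as 'coze', but something like 'coz' should not count. This is what I have so far: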

def count(word):
  count=0
  for i in range(len(word)):
    lo=word[i:i+4]
    if lo=='co': # this is what gives me trouble
      count+=1
  return count

Answer 1

Test whether the first two characters match co and whether the fourth character matches e.

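def count(word):
  count=0
  # stop at len(word)-3 so that word[i+3] is always a valid index
  for i in range(len(word)-3):
    # the slice [i:i+2] covers exactly two characters
    if word[i:i+2] == 'co' and word[i+3] == 'e':
      count+=1
  return count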

The loop only goes up to len(word)-3, so word[i+3] never goes out of range.
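As a quick check of the idea above, here is a minimal sketch of the same loop written with str.startswith and a start offset instead of a slice. The name count_co_e and the sample strings are only illustrative and do not come from the question.

```python
def count_co_e(text):
    """Count positions where 'co' is followed by any one character and then 'e'."""
    total = 0
    # Stop 3 short of the end so text[i + 3] is always a valid index.
    for i in range(len(text) - 3):
        # startswith(prefix, i) checks for the prefix beginning at index i.
        if text.startswith('co', i) and text[i + 3] == 'e':
            total += 1
    return total

print(count_co_e('code coze coz'))   # 2
print(count_co_e('xxcodexxcope'))    # 2
print(count_co_e('coz'))             # 0, too short to contain a match
```

Strings shorter than four characters give an empty range, so the function returns 0 without any special case.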

Answer 2

You can use regular expressions through the re module.

import re
string = 'this is a string containing the words code, coze, and coz'
print(re.findall(r'co.e', string))
# ['code', 'coze']

From there you can write a function such as:

def count(string, word):
    return len(re.findall(word, string))
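To show how a function like this would be called, here is a small self-contained sketch. The name count_matches and the extra sample strings are my own, not part of the answer. Note that re.findall counts non-overlapping matches, and that '.' matches any character except a newline, so punctuation and spaces count too unless the pattern is narrowed.

```python
import re

def count_matches(text, pattern):
    # re.findall returns all non-overlapping matches as a list.
    return len(re.findall(pattern, text))

sample = 'this is a string containing the words code, coze, and coz'
print(count_matches(sample, r'co.e'))            # 2

# '.' also accepts '-' and a space in the third position.
print(count_matches('co-e co e', r'co.e'))       # 2

# A character class restricts the third position to lowercase letters.
print(count_matches('co-e co e', r'co[a-z]e'))   # 0
```

If only letters should be allowed in the variable position, a class such as [a-z] is probably closer to what the question asks for than a bare '.'.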

Answer 3

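A regular expression is the answer to the question above, but what you need is a more refined pattern. Since you are looking for occurrences of whole words, you need to search on word boundaries. So your pattern should be something like this: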

pattern = r'\bco.e\b'

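This way the search will not match inside words such as testcodetest or cozetest. It will only match code, coze, coke and the like when they stand alone, with no leading or trailing word characters attached.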

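If you are going to run many tests, it is better to use a compiled pattern, so the pattern is compiled once and the same object is reused for every search.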

In [1]: import re

In [2]: string = 'this is a string containing the codeorg testcozetest words code, coze, and coz'

In [3]: pattern = re.compile(r'\bco.e\b')

In [4]: pattern.findall(string)
Out[4]: ['code', 'coze']
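Wrapped up as a counting function, the compiled pattern could be reused like this. This is only a sketch: the names CO_E_WORD and count_words and the sample strings are made up for illustration.

```python
import re

# Compile once at module level, then reuse the same object for every input.
CO_E_WORD = re.compile(r'\bco.e\b')

def count_words(text):
    # findall on a compiled pattern takes only the string to search.
    return len(CO_E_WORD.findall(text))

samples = [
    'code coze coke',                 # three standalone words
    'testcodetest cozetest codeorg',  # matches buried inside longer words
    'a code, a coze, and a coz',      # punctuation still counts as a boundary
]
for s in samples:
    print(count_words(s))
# prints 3, then 0, then 2
```

The second sample gives 0 because \b requires a transition between a word character and a non-word character on both sides of the match.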

Hope that helps.

Slight variations of a substring
