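I'm trying to figure out how to read a string of names that has no spaces between them, e.g. robbybobby. I want it to search the string and separate the names into their own groups.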
def wordcount(filename, listwords):
    try:
        file = open(filename, "r")
        read = file.readline()
        file.close()
        for word in listwords:
            lower = word.lower()
            count = 0
            for letter in read:
                line = letter.split()
                for each in line:
                    line2 = each.lower()
                    line2 = line2.strip(".")
                    if lower == line2:
                        count += 1
            print(lower, ":", count)
    except FileExistsError:
        print("no")

wordcount("teststring.txt", ["robby"])
With this code, it only finds robby if there is a space after it.
Answer 0 (score: 0)
There are several ways to do this. I'm posting 2 suggestions so that you can understand and improve on them :)
Solution 1:
def count_occurrences(line, word):
    # Normalize vars
    word = word.lower()
    line = line.lower()
    # Initialize vars
    start_index = 0
    total_count = 0
    word_len = len(word)
    # Count ignoring empty spaces
    while start_index >= 0:
        # Ignore if not found
        if word not in line[start_index:]:
            break
        # Search for the word starting from <start_index> index
        start_index = line.index(word, start_index)
        # Increment if found
        if start_index >= 0:
            start_index += word_len
            total_count += 1
    # Return total occurrences
    return total_count

print(count_occurrences('stackoverflow overflow overflowABC over', 'overflow'))
Output: 3
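For what it's worth, the loop above counts non-overlapping matches, which is also what the built-in str.count does, so a shorter version is possible. This is just a sketch; the name count_substring is made up for illustration and is not part of the code above.

```python
def count_substring(line, word):
    """Count non-overlapping, case-insensitive occurrences of word in line."""
    # Guard against an empty search word (hypothetical choice: treat it as no match)
    if not word:
        return 0
    # str.count scans left to right and never counts overlapping matches
    return line.lower().count(word.lower())


print(count_substring('stackoverflow overflow overflowABC over', 'overflow'))
```

It should print the same total as Solution 1 for the same input.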
Solution 2:
If you want to use regular expressions, this link may be useful:
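In case it helps, here is roughly what the regex route could look like. This is my own minimal sketch, not something taken from the link; count_with_regex is a hypothetical name, and re.escape is there only so that characters like "." in the search word are matched literally.

```python
import re


def count_with_regex(line, word):
    """Count case-insensitive matches of a literal word anywhere in line."""
    # Escape the word so regex metacharacters are treated as plain text
    pattern = re.escape(word)
    # findall returns every non-overlapping match, inside other words or not
    return len(re.findall(pattern, line, flags=re.IGNORECASE))


print(count_with_regex("robbybobby Robby", "robby"))
```

Because there are no word boundaries in the pattern, robby is counted both inside robbybobby and on its own.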
Answer 1 (score: 0)
IIUC, you want to count each word regardless of whether it appears as part of another word or on its own.
You can use a simple regex for this:
import re

def count_line(counts, line, words):
    # Add this line's case-insensitive matches for each word to the running totals
    for word in words:
        counts[word] = len(re.findall(word, line, re.IGNORECASE)) + counts.get(word, 0)
    return counts

allLines = """
bobby robbubobby yo xyz\n
robson bobbyrobin abc\n
xyz bob amy oo\n
amybobson robson
"""
print(allLines)
words = ["amy", "robby", "bobby", "jack"]
res = {}
for line in allLines.split("\n"):
    res = count_line(res, line, words)
print(res)
Output:
bobby robbubobby yo xyz
robson bobbyrobin abc
xyz bob amy oo
amybobson robson
{'amy': 2, 'robby': 0, 'bobby': 3, 'jack': 0}
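If what you actually need is to break a run-together string such as robbybobby into its separate names rather than just count them, one possible approach is to build the split up position by position from a list of known names. This is only a sketch under the assumption that you know the names in advance; split_names is a hypothetical helper, not part of the code above.

```python
def split_names(text, names):
    """Split text into known names; return None if no full split exists."""
    text = text.lower()
    names = [n.lower() for n in names if n]
    # best[i] holds one valid split of text[:i], or None if none was found
    best = [None] * (len(text) + 1)
    best[0] = []
    for i in range(len(text)):
        if best[i] is None:
            continue  # text[:i] cannot be split, so nothing can start here
        for name in names:
            end = i + len(name)
            # Extend the split when a known name starts at position i
            if text.startswith(name, i) and best[end] is None:
                best[end] = best[i] + [name]
    return best[len(text)]


print(split_names("robbybobby", ["robby", "bobby"]))
```

It returns None when some part of the string does not match any known name, so you can tell a clean split from a failed one.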