Question

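I have a problem with my code. I have a .csv file named test.csv that contains 3 sentences, and code that walks through every word in those sentences and works out the positions of its first and last letters. But when I try the for loop below, it finds the specific word, yet it only counts that word and ignores the rest of the sentence. I want to print the specific word while keeping its correct position.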

with open("test.csv") as e:
    text = e.read()

newtext = text.split()
words = '' 

currCount = 0 


for words in newtext:

    toAdd = len(words)
    if words == 'is':
        print ("("+str(currCount)+","+str(currCount+toAdd)+")"+ words)
    elif words != 'is':
        continue

    currCount+= toAdd+1

    if words is ".":
        currCount = 0

These are the sentences in "test.csv":

my name is bob .
bob is my name .
my real name is lob .

Output:

Output                                   What I want
(0,2)is                                  (8,10)is
(3,5)is                                  (4,6)is
(6,8)is                                  (13,15)is

Answer 1

The problem is this part:

elif words != 'is':
    continue

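This skips everything after it, in particular the part that increments currCount, and jumps straight to the next iteration of the loop. You probably meant "do nothing in this case". If you want to make that explicit, use pass instead of continue. Also note that the elif is redundant: its condition is exactly the inverse of the first one, so you could just use else.

But really, you can just remove those two lines entirely.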
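As a rough sketch of what the loop looks like once those two lines are gone: the function name find_word_spans and the inline sample string are assumptions made here so the example runs without the file, and it assumes tokens are separated by a single space, as in the question's data. It also compares against "." with == rather than is, since is tests object identity and is not guaranteed to work for strings.

```python
def find_word_spans(text, target):
    """Return (start, end) offsets of target, counted from the start of each sentence."""
    spans = []
    curr_count = 0
    for word in text.split():
        to_add = len(word)
        if word == target:
            spans.append((curr_count, curr_count + to_add))
        # Always advance the offset: word length plus one separating space.
        curr_count += to_add + 1
        # A standalone "." token ends the sentence, so restart the offset.
        if word == ".":
            curr_count = 0
    return spans


# Hypothetical stand-in for the contents of test.csv.
sample = "my name is bob .\nbob is my name .\nmy real name is lob .\n"
for start, end in find_word_spans(sample, "is"):
    print("(" + str(start) + "," + str(end) + ")" + "is")
# (8,10)is
# (4,6)is
# (13,15)is
```

The key point is that the offset update runs for every word, not only for the one being searched for.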

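Alternatively, you can use a regular expression to find the word and its position:

import re
with open("test.csv") as e:
    for line in e:
        # \b matches a word boundary, so only the whole word "is" is found
        for match in re.finditer(r"\bis\b", line):
            print(match.group(), match.span())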

Answer 2

Use the following code:

import re

def FindPosition(String, word):
    # Return the (start, end) span of every match of word in String
    return [(a.start(), a.end()) for a in re.finditer(word, String)]

aString = 'my name is bob.\nbob is my name.\nmy real name is lob .'
word = "is"
NewText = aString.split("\n")

for line in NewText:
    Match_List = FindPosition(line, word)
    for pos in Match_List:
        print(pos, " ", word)

Output:

(8, 10)   is
(4, 6)   is
(13, 15)   is
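One caveat with passing the bare word as the pattern: it also matches inside longer words such as "this" or "list". The sketch below shows the difference; find_whole_word is a hypothetical helper name, and the sample line is made up for illustration. It wraps the word in \b word boundaries and uses re.escape so that any regex metacharacters in the word are treated literally.

```python
import re


def find_whole_word(line, word):
    """Return spans where word occurs as a whole word in line."""
    # re.escape guards against regex metacharacters in the search word;
    # \b anchors the match at word boundaries on both sides.
    pattern = r"\b" + re.escape(word) + r"\b"
    return [m.span() for m in re.finditer(pattern, line)]


line = "this is his list"

# The bare pattern matches every "is" substring, including inside other words.
print([m.span() for m in re.finditer("is", line)])
# [(2, 4), (5, 7), (9, 11), (13, 15)]

# With word boundaries, only the standalone word is reported.
print(find_whole_word(line, "is"))
# [(5, 7)]
```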

How to print a specific word without losing the word lengths
