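I am trying to implement an iterator class named CharCounter. The class opens a text file and provides an iterator that returns the words in that file containing a user-specified number of characters. It should output one word per line. That is not what it does: it outputs the words as a list and then keeps printing 'a'. How can I fix my code?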
class CharCounter(object):
    def __init__(self, fileNm, strlen):
        self._fileNm = fileNm
        self._strlen = strlen
        fw = open(fileNm)
        text = fw.read()
        lines = text.split("\n")
        words = []
        pwords =[]
        for each in lines:
            words += each.split(" ")
        chkEnd = ["'",'"',",",".",")","("]
        if words[-1] in chkEnd:
            words = words.rstrip()
        for each in words:
            if len(each) == strlen:
                pwords.append(each)
        print(pwords)

    def __iter__(self):
        return CharCounterIterator(self._fileNm)

class CharCounterIterator(object):
    def __init__(self,fileNm):
        self._fileNm = fileNm
        self._index = 0

    def __iter__(self):
        return self

    def next(self):
        try:
            ret = self._fileNm[self._index]
            return ret
        except IndexError:
            raise StopIteration

if __name__=="__main__":
    for word in CharCounter('agency.txt',11):
        print "%s" %word
Answer 0 (score: 0)
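Code posted on SO should not read files unless the question is about reading files, because the result cannot be reproduced and verified. (See MCVE.) Instead, define a text string as a stand-in for the file.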
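Your code prints the words of length n as a list because that is what you ask it to do with print(pwords). It prints the first character of the file name repeatedly because that is what you ask it to do in the iterator's next method: the iterator is given the file name rather than the word list, and self._index is never incremented, so every call returns self._fileNm[0].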
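The repeated 'a' can be reproduced without any file. The following is a minimal Python 3 sketch, not the asker's exact code: StuckIterator is a hypothetical name, and itertools.islice is used only to cut off what would otherwise be an infinite loop.

```python
from itertools import islice


class StuckIterator:
    """Mimics the question's iterator: indexes a string, never advances."""

    def __init__(self, data):
        self._data = data
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            # self._index is never incremented, so this is always data[0].
            return self._data[self._index]
        except IndexError:
            raise StopIteration


# Take only the first five items; the iterator itself never stops.
first_five = list(islice(StuckIterator('agency.txt'), 5))
print(first_five)  # ['a', 'a', 'a', 'a', 'a']
```

Adding self._index += 1 before the return would stop the repetition, but the iterator would then walk through the characters of the file name, so it also needs to be handed the filtered word list instead.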
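Your class's __init__ does more than you describe. The attempt to strip punctuation from the words does not work: it only tests whether the last word is itself a punctuation mark, and words is a list, which has no rstrip method. The code below defines a class that turns text into a list of stripped words (duplicates included). It also defines a parameterized generator method that filters the word list.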
class Words:
    def __init__(self, text):
        self.words = words = []
        for line in text.split('\n'):
            for word in line.split():
                words.append(word.strip(""",'."?!()[]{}*$#"""))

    def iter_n(self, n):
        for word in self.words:
            if len(word) == n:
                yield word

# Test
text = """
It should output a word per line.
Which is not what's it's doing!
(It outputs the words as a [list] and then continuously outputs 'a'.)
How can I fix my #*!code?
"""
words = Words(text)
for word in words.iter_n(5):
    print(word)

# Prints
# Which
# doing
# words
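
If you want to keep the two-class iterator design from the question rather than a generator method, one possible Python 3 sketch follows. It assumes the constructor takes any iterable of lines (an open file or an io.StringIO stand-in) instead of a file name, and it reuses the punctuation set from the Words class above; both choices are mine, not part of the original code.

```python
import io


class CharCounter:
    """Iterable over the words of a given length found in a text stream."""

    PUNCT = """,'."?!()[]{}*$#"""  # same characters stripped by Words above

    def __init__(self, stream, strlen):
        self._strlen = strlen
        # Split each line on whitespace and strip surrounding punctuation.
        self._words = [word.strip(self.PUNCT)
                       for line in stream
                       for word in line.split()]

    def __iter__(self):
        # Hand the iterator the word list, not the file name.
        return CharCounterIterator(self._words, self._strlen)


class CharCounterIterator:
    def __init__(self, words, strlen):
        self._words = words
        self._strlen = strlen
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Advance the index on every step and skip words of the wrong length.
        while self._index < len(self._words):
            word = self._words[self._index]
            self._index += 1
            if len(word) == self._strlen:
                return word
        raise StopIteration


sample = io.StringIO("Which is not what's it's doing!\n"
                     "(It outputs the words as a [list].)\n")
result = list(CharCounter(sample, 5))
for word in result:
    print(word)
```

With a real file, the same class could be used as CharCounter(open('agency.txt'), 11). In Python 2 the iterator method would have to be named next instead of __next__.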