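I'm trying to solve a problem for my programming class. I'm given a folder containing emails and special files. The special files always start with "!". I'm supposed to add a method emails() to the Corpus class. The method should be a generator. Here is an example of its usage: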
corpus = Corpus('/path/to/directory/with/emails')
count = 0
# Go through all emails and print the filename and the message body
for fname, body in corpus.emails():
    print(fname)
    print(body)
    print('-------------------------')
    count += 1
print('Finished: ', count, 'files processed.')
Here is the class and the method I wrote:
class Corpus:
    def __init__(self, path_to_mails_directory):
        self.path_to_mails_directory = path_to_mails_directory

    def emails(self):
        iterator = 0
        mail_body = None
        mails_folder = os.listdir(self.path_to_mails_directory)
        lenght = len(mails_folder)
        while iterator <= lenght:
            if not mails_folder[iterator].startswith("!"):
                with open(self.path_to_mails_directory+"/"+mails_folder[iterator]) as an_e_mail:
                    mail_body = an_e_mail.read()
                yield mails_folder[iterator], mail_body
            iterator += 1
I tried running the example code this way:
if __name__ == "__main__":
    my_corpus = Corpus("data/1")
    my_gen = my_corpus.emails()
    count = 0
    for fname, body in my_gen:
        print(fname)
        print(body)
        print("------------------------------")
        count += 1
    print("finished: " + str(count))
Python prints quite a lot of emails as expected (the folder contains about a thousand files), and then ends with:
Traceback (most recent call last):
  File "C:/Users/tvavr/PycharmProjects/spamfilter/corpus.py", line 26, in <module>
    for fname, body in my_gen:
  File "C:/Users/tvavr/PycharmProjects/spamfilter/corpus.py", line 15, in emails
    if not mails_folder[iterator].startswith("!"):
IndexError: list index out of range
I don't know what the problem is and would appreciate any help. Thanks.
Edit: I updated the code a bit based on your suggestions.
Answer 0 (score: 0)
A good way to do this is as follows:
def emails(self):
    mail_body = None
    mails_folder = os.listdir(self.path_to_mails_directory)
    for mail in mails_folder:
        if mail.startswith("!"):
            pass
        else:
            with open(self.path_to_mails_directory+"/"+mail) as an_e_mail:
                mail_body = an_e_mail.read()
            yield mail, mail_body
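Index-based iteration is not considered Pythonic. You should prefer the "for mail in mails_folder:" syntax. It also removes the cause of the IndexError: with the condition "iterator <= lenght", the index eventually reaches len(mails_folder), which is one past the last valid index (len(mails_folder) - 1), so the final pass through the loop reads outside the list.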
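For completeness, here is a minimal self-contained sketch of the whole class with the for-loop version of emails(), plus a small check that you can run without the real data. The demo() helper, the temporary directory, and the sample file names (a.txt, b.txt, !truth.txt) are made up for illustration and are not part of the assignment. The sorted() call is only there to make the output order predictable, since os.listdir() returns entries in arbitrary order.

```python
import os
import tempfile


class Corpus:
    def __init__(self, path_to_mails_directory):
        self.path_to_mails_directory = path_to_mails_directory

    def emails(self):
        """Yield (filename, body) for every file whose name does not start with '!'."""
        # Iterate over the names directly; no index means no off-by-one error.
        for fname in sorted(os.listdir(self.path_to_mails_directory)):
            if fname.startswith("!"):
                continue  # skip the special files
            path = os.path.join(self.path_to_mails_directory, fname)
            with open(path) as an_e_mail:
                body = an_e_mail.read()
            # Yield after the file is closed so it is not held open
            # while the generator is suspended.
            yield fname, body


def demo():
    """Hypothetical helper: build a tiny fake corpus and list its emails."""
    samples = [("a.txt", "hello"), ("b.txt", "world"), ("!truth.txt", "a.txt OK")]
    with tempfile.TemporaryDirectory() as tmp:
        for name, text in samples:
            with open(os.path.join(tmp, name), "w") as f:
                f.write(text)
        return list(Corpus(tmp).emails())


if __name__ == "__main__":
    for fname, body in demo():
        print(fname, body)
```

If you do want to keep the while loop, changing the condition to "iterator < lenght" should be enough to stop the IndexError, but the for loop is shorter and harder to get wrong.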