I have a text file with entries that look like this:
JohnDoe
Assignment 9
Reading: NO
header: NO
HW: NO
Solutions: 0
show: NO
Journals: NO
free: NO
Finished: NO
Quiz: 0
Done
Assignment 3
E-book: NO
HW: NO
Readings: NO
Show: 0
Journal: NO
Study: NO
Test: NO
Finished: NO
Quiz: 0
Done
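This is a small sample. The file contains several students. Each student has two assignments under their name, and they pass only if the line starting with "Finished" in each assignment reads "Finished: YES". The data under each assignment is in no particular order, but somewhere in each assignment there is a line that says "Finished: YES" (or NO). I need a way to read the file and report whether any of the students passed. So far I have: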
def get_entries( file ):
    with open( "dicrete.txt.rtf", 'rt') as file:
        for line in file:
            if "Finished" in line:
                finished, answer = line.split(':')
                yield finished, answer

# dict takes a sequence of `(key, value)` pairs and turns it into a dict
print dict(get_entries( file ))
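I can only get this code to return one entry: the first "Finished" it reads becomes the key and "YES" or "NO" becomes the value. That is the shape I want, but I want it to return every line in the file that starts with "Finished". For the sample data I provided, I would like it to return a dictionary with 2 entries: {Finished: "NO", Finished: "NO"}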
Answer 0 (score: 2)
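A dictionary can store only one mapping per key, so you will never have a dictionary with two different entries for the same key.
Consider using a list of two-tuples instead, such as [("Finished", "NO"), ("Finished", "NO")].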
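As a minimal sketch of the difference, assuming lines shaped like the question's sample (the helper name `finished_pairs` and the sample list are made up for illustration and are not part of the question's code):

```python
def finished_pairs(lines):
    """Return a list of (key, value) tuples, one per line starting with 'Finished'."""
    pairs = []
    for line in lines:
        if line.strip().startswith("Finished"):
            # partition splits on the first ':' only, so it never raises
            key, _, value = line.partition(":")
            pairs.append((key.strip(), value.strip()))
    return pairs


# Hypothetical sample data in the same layout as the question's file
sample = ["Assignment 9", "Finished: NO", "Quiz: 0", "Done",
          "Assignment 3", "Finished: YES", "Done"]

pairs = finished_pairs(sample)
print(pairs)        # the list keeps one tuple per "Finished" line
print(dict(pairs))  # the dict keeps only the last value stored under "Finished"
```

The list preserves every occurrence in file order, whereas converting it to a dict silently overwrites earlier values that share the same key.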
Answer 1 (score: 0)
Sounds like you need a better data model! Let's take a look, shall we?
We can define an Assignment class that is built from all the lines of text between Assignment: # and Finished: YES/NO.
class Assignment(object):
    def __init__(self, id, *args, **kwargs):
        self.id = id
        # Store every parsed "Key: value" line as a lower-cased attribute.
        for key, val in kwargs.items():
            setattr(self, key.lower(), val)
        finished = getattr(self, 'finished', None)
        if finished is None:
            raise AttributeError("All assignments must have a 'finished' value")
        self.finished = finished.lower() == "yes"

    @classmethod
    def from_string(cls, s):
        r"""Builds an Assignment object from a string

        >>> a = Assignment.from_string('''Assignment: 1\nAttributes: Go Here\nFinished: yes''')
        >>> a.id
        1
        >>> a.finished
        True"""
        d = dict()
        id = None
        for line in s.splitlines():
            # Extended unpacking requires Python 3.
            key, *val = map(str.strip, line.split(":"))
            val = ' '.join(val) or None
            if key.lower().startswith('assignment'):
                # Accept both "Assignment: 1" and "Assignment 9" headers.
                id = int((val or key).split()[-1])
                continue
            d[key.lower()] = val
        if id is not None:
            return cls(id, **d)
        else:
            raise ValueError("No 'Assignment' field in string {}".format(s))
Once you have the model, you need to parse the input. Luckily, that is actually pretty simple.
def splitlineson(s, sentinel):
    r"""Splits an iterable of strings into newline-separated strings, each beginning with the sentinel.

    >>> s = ["Garbage", "lines", "SENT$", "first", "group", "SENT$", "second", "group"]
    >>> list(splitlineson(s, "SENT$"))
    ['SENT$\nfirst\ngroup', 'SENT$\nsecond\ngroup']"""
    lines = []
    for line in s:
        if line.lower().strip().startswith(sentinel.lower()):
            # Emit the previous group, but only if it actually began with a sentinel.
            if any(sentinel.lower() in seen.lower() for seen in lines):
                yield "\n".join(lines)
            lines = [line.strip()]
        else:
            # Skip blank lines (lines read from a file still carry their "\n").
            if line.strip():
                lines.append(line.strip())
    yield "\n".join(lines)
with open('path/to/textfile.txt') as inf:
    # splitlineson is a lazy generator, so consume it while the file is still open.
    assignments = splitlineson(inf, "assignment ")
    assignment_list = [Assignment.from_string(a) for a in assignments]
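To get from parsed assignments back to the original question (which students passed), the "Finished" results still have to be grouped by student. The following is a rough, standalone sketch of one way that could look; it does not use the Assignment class. It assumes that a student name is any non-empty line without a colon that is neither "Done" nor an "Assignment ..." header. That rule is a guess based on the sample, and the function name `passed_students` and the second student `JaneRoe` are invented for illustration.

```python
def passed_students(lines):
    """Map each student name to True only if all of their assignments are finished.

    Assumed layout (inferred from the sample, not confirmed): a student name is
    a non-empty line with no colon that is neither "Done" nor an
    "Assignment ..." header.
    """
    results = {}
    student = None
    for raw in lines:
        line = raw.strip()
        if not line or line.lower() == "done":
            continue
        if line.lower().startswith("assignment"):
            continue
        if ":" not in line:
            # Treat this line as the start of a new student's section.
            student = line
            results.setdefault(student, [])
            continue
        key, _, value = line.partition(":")
        if key.strip().lower() == "finished" and student is not None:
            results[student].append(value.strip().lower() == "yes")
    # A student with no "Finished" lines at all is treated as not passed.
    return {name: bool(flags) and all(flags) for name, flags in results.items()}


# Hypothetical data: JohnDoe fails one assignment, JaneRoe finishes both.
sample = """JohnDoe
Assignment 9
Reading: NO
Finished: NO
Done
Assignment 3
Finished: YES
Done
JaneRoe
Assignment 9
Finished: YES
Done
Assignment 3
Quiz: 0
Finished: YES
Done
""".splitlines()

print(passed_students(sample))
```

If the real file marks student names differently, only the `":" not in line` branch should need to change.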