I am trying to read a txt file and build a dictionary from its text. A sample txt file is:
John likes Steak
John likes Soda
John likes Cake
Jane likes Soda
Jane likes Cake
Jim likes Steak
The output I want is a dictionary with each name as a key and that person's "likes" as the corresponding list of values:
{'John':('Steak', 'Soda', 'Cake'), 'Jane':('Soda', 'Cake'), 'Jim':('Steak')}
I keep getting errors when adding the split words to my lists, and I have tried several different approaches:
pred = ()
prey = ()
spacedLine = inf.readline()
line = spacedLine.rstrip('\n')
while line!= "":
    line = line.split()
    pred.append = (line[0])
    prey.append = (line[2])
    spacedLine = inf.readline()
    line = spacedLine.rstrip('\n')
And also:
spacedLine = inf.readline()
line = spacedLine.rstrip('\n')
while line!= "":
    line = line.split()
    if line[0] in chain:
        chain[line[0] = [0, line[2]]
    else:
        chain[line[0]] = line[2]
    spacedLine = inf.readline()
    line = spacedLine.rstrip('\n')
Any ideas?
Answer 0 (score: 2)
Do it like this (there is no need to read the whole file into memory first):
likes = {}
for who, _, what in (line.split()
                     for line in (line.strip()
                                  for line in open('likes.txt', 'rt'))):
    likes.setdefault(who, []).append(what)
print(likes)
Output:
{'Jane': ['Soda', 'Cake'], 'John': ['Steak', 'Soda', 'Cake'], 'Jim': ['Steak']}
Alternatively, to simplify it slightly, you can use a temporary collections.defaultdict:
from collections import defaultdict
likes = defaultdict(list)
for who, _, what in (line.split()
                     for line in (line.strip()
                                  for line in open('likes.txt', 'rt'))):
    likes[who].append(what)
print(dict(likes))  # convert to plain dictionary and print
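Both versions above fail on a blank line, because an empty split() result cannot be unpacked into three names. The following is a minimal sketch of one way to handle that, not part of the original answer: the helper name read_likes is hypothetical, and io.StringIO stands in for the real file so the example is self-contained.

```python
import io
from collections import defaultdict


def read_likes(lines):
    """Group the third word of each line under the first word (hypothetical helper)."""
    likes = defaultdict(list)
    for line in lines:
        parts = line.split()
        if not parts:  # blank line: nothing to parse
            continue
        who, _, what = parts  # raises ValueError if the line is not three words
        likes[who].append(what)
    return dict(likes)


# io.StringIO is an assumption standing in for the file from the answer.
sample = io.StringIO("John likes Steak\n\nJohn likes Soda\nJane likes Cake\n")
likes = read_likes(sample)
print(likes)
```

With a real file, the same helper could be called inside a with block, for example `with open('likes.txt') as f: likes = read_likes(f)`, which also closes the file when the block ends.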
Answer 1 (score: 1)
Your input is a sequence of sequences. Parse the outer sequence first, then parse each item.
Your outer sequence is:
Statement
<empty line>
Statement
<empty line>
...
Assume f is an open file containing the data. Read each statement and return a list of them:
def parseLines(f):
    result = []
    for line in f:  # file objects iterate over text lines
        line = line.strip()  # drop the trailing newline; a blank line becomes ""
        if line:  # line is non-empty
            result.append(line)
    return result
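Note that the function above accepts a wider grammar: it allows arbitrarily many empty lines between non-empty lines, and it allows two non-empty lines in a row. It does, however, accept every correct input.
Next, each statement is a triple: X likes Y. Parse it by splitting on whitespace and checking the structure. The result is a valid pair (x, y).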
def parseStatement(s):
    parts = s.split()  # by default, it splits by all whitespace
    assert len(parts) == 3, "Syntax error: %r is not three words" % s
    x, likes, y = parts  # unpack the list of 3 items into variables
    assert likes == "likes", "Syntax error: %r instead of 'likes'" % likes
    return x, y
Build a list of pairs, one for each statement:
pairs = [parseStatement(s) for s in parseLines(f)]
Now you need to group the values by key. Let's use defaultdict, which supplies a default value for any new key:
from collections import defaultdict
the_answer = defaultdict(list)  # the default value is an empty list
for key, value in pairs:
    the_answer[key].append(value)
    # we can append because the_answer[key] is set to an empty list on first access
So the_answer is what you need, except that it uses lists rather than tuples as the dict values. This should be enough for you to understand your homework.
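If tuples are really wanted, as in the question's desired output, the lists can be converted once the grouping is done. This is a minimal sketch under stated assumptions: the names grouped and as_tuples are illustrative, and the pairs list is hard-coded to match the question's sample file instead of being parsed from it.

```python
from collections import defaultdict

# Sample pairs, hard-coded here as an assumption; in the answer they come
# from parsing each statement of the file.
pairs = [("John", "Steak"), ("John", "Soda"), ("John", "Cake"),
         ("Jane", "Soda"), ("Jane", "Cake"), ("Jim", "Steak")]

grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)

# Freeze each list into a tuple. Note that tuple(['Steak']) is ('Steak',),
# a one-element tuple, whereas ('Steak') is just the string 'Steak'.
as_tuples = {key: tuple(values) for key, values in grouped.items()}
print(as_tuples)
```

Building lists first and converting at the end keeps the appends cheap, since tuples are immutable and cannot be appended to in place.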
Answer 2 (score: 0)
dic = {}
for i in f.readlines():
    if i.strip():  # skip blank lines (a raw line still contains '\n', so test the stripped text)
        if i.split()[0] in dic.keys():
            dic[i.split()[0]].append(i.split()[2])
        else:
            dic[i.split()[0]] = [i.split()[2]]
print(dic)
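This should do it.
Here we iterate over f.readlines(), where f is the file object, and fill the dictionary line by line, using the first part of the split as the key and the last part of the split as the value.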
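The same idea can be written so that each line is split only once. This is a rough sketch, not the answerer's code: io.StringIO stands in for the open file f, and the name words is illustrative.

```python
import io

# io.StringIO is an assumption standing in for the open file object f.
f = io.StringIO("John likes Steak\nJohn likes Soda\n\nJim likes Steak\n")

dic = {}
for line in f:  # iterating the file avoids building the whole readlines() list
    words = line.split()  # split once and reuse the result
    if len(words) < 3:  # skip blank or too-short lines
        continue
    # setdefault creates the empty list the first time a name is seen
    dic.setdefault(words[0], []).append(words[2])
print(dic)
```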