Appending to lists read from a text file in Python 3

Date: 2014-12-05 17:34:31

Tags: python list dictionary

I am trying to read a txt file and build a dictionary from its text. A sample txt file is:

John likes Steak

John likes Soda

John likes Cake

Jane likes Soda

Jane likes Cake

Jim likes Steak

The output I want is a dictionary with the names as keys and the "likes" as a list of the corresponding values:

{'John':('Steak', 'Soda', 'Cake'), 'Jane':('Soda', 'Cake'), 'Jim':('Steak')}

I keep running into errors when appending my stripped words to my lists, and I have tried a few different approaches:

pred = ()

prey = ()

spacedLine = inf.readline()

line = spacedLine.rstrip('\n')

while line!= "":

    line = line.split()
    pred.append = (line[0])
    prey.append = (line[2])
    spacedLine = inf.readline()
    line = spacedLine.rstrip('\n')

And also:

spacedLine = inf.readline()

line = spacedLine.rstrip('\n')

while line!= "":

     line = line.split()      
     if line[0] in chain:
       chain[line[0] = [0, line[2]]
      else:
        chain[line[0]] = line[2]
    spacedLine = inf.readline()
    line = spacedLine.rstrip('\n')

Any ideas?

3 answers:

Answer 0 (score: 2)

This will do it (without needing to read the whole file into memory first):

likes = {}
for who, _, what in (line.split()
                        for line in (line.strip()
                            for line in open('likes.txt', 'rt'))):
    likes.setdefault(who, []).append(what)

print(likes)

Output:

{'Jane': ['Soda', 'Cake'], 'John': ['Steak', 'Soda', 'Cake'], 'Jim': ['Steak']}

Alternatively, to simplify things slightly, you could use a temporary collections.defaultdict:

from collections import defaultdict

likes = defaultdict(list)
for who, _, what in (line.split()
                        for line in (line.strip()
                            for line in open('likes.txt', 'rt'))):
    likes[who].append(what)

print(dict(likes))  # convert to plain dictionary and print
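
The same idea can also be written with a `with` block, so that the file is closed as soon as reading finishes, and with a guard for blank lines (the sample in the question appears to contain them). This is only a minimal sketch, assuming every non-blank line has exactly three whitespace-separated words; the function name `read_likes` and its `path` argument are made up for illustration:

```python
def read_likes(path):
    """Group the third word of each line by the first word (hypothetical helper)."""
    likes = {}
    with open(path, 'rt') as inf:      # the file is closed when the block exits
        for line in inf:               # iterates lazily, one line at a time
            parts = line.split()
            if not parts:              # blank line: nothing to record
                continue
            who, _, what = parts       # assumes exactly three words per line
            likes.setdefault(who, []).append(what)
    return likes
```

Calling `read_likes('likes.txt')` would then return the same kind of dictionary as the loops above.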

Answer 1 (score: 1)

Your input is a sequence of sequences. Parse the outer sequence first, then parse each item.

Your outer sequence is:

Statement
<empty line>
Statement
<empty line>
...

Assume f is an open file containing the data. Read each statement and return a list of them:

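def parseLines(f):
  result = []
  for line in f:  # file objects iterate over text lines
    line = line.strip()  # drop the trailing newline; a raw blank line is '\n', which is truthy
    if line:  # line is non-empty after stripping
      result.append(line)
  return result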

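Note that the function above accepts a wider grammar than the one described: it allows any number of empty lines between non-empty lines, and it allows two non-empty lines in a row. But it does accept every correct input.

Next, each of your statements is a triple: X likes Y. Parse it by splitting on whitespace and checking the structure. The result is a proper pair (x, y):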

def parseStatement(s):
  parts = s.split()  # by default, it splits by all whitespace
  assert len(parts) == 3, "Syntax error: %r is not three words" % s
  x, likes, y = parts  # unpack the list of 3 items into variables
  assert likes == "likes", "Syntax error: %r instead of 'likes'" % likes
  return x, y

Build a list of pairs, one per statement:

pairs = [parseStatement(s) for s in parseLines(f)]
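
To try the two parsing steps together without a file on disk, here is a small self-contained sketch. It uses io.StringIO as a stand-in for the open file f, and the snake_case names `parse_lines` and `parse_statement` are illustrative re-implementations rather than the exact functions above. Blank lines are detected after stripping the trailing newline.

```python
import io

def parse_lines(f):
    """Return the non-blank lines of f, stripped of surrounding whitespace."""
    return [line.strip() for line in f if line.strip()]

def parse_statement(s):
    """Split 'X likes Y' into the pair (X, Y); raise ValueError otherwise."""
    parts = s.split()
    if len(parts) != 3 or parts[1] != "likes":
        raise ValueError("not of the form 'X likes Y': %r" % s)
    return parts[0], parts[2]

# In-memory stand-in for the file, including an empty separator line.
sample = io.StringIO("John likes Steak\n\nJane likes Soda\n")
pairs = [parse_statement(s) for s in parse_lines(sample)]
print(pairs)  # [('John', 'Steak'), ('Jane', 'Soda')]
```

Raising ValueError instead of using assert is a design choice in this sketch: assert statements are skipped when Python runs with the -O option, so they are a weak way to validate input.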

Now you need to group the values by key. Let's use defaultdict, which supplies a default value for any new key:

from collections import defaultdict

the_answer = defaultdict(list)  # the default value is an empty list

for key, value in pairs:
  the_answer[key].append(value) 
  # we can append because the_answer[key] is set to an empty list on first access

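So the_answer here is what you need, except that it uses lists rather than tuples as the dict values. This should be enough for you to work out your homework.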
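
If tuples really are wanted, as in the output shown in the question, the grouped lists can be converted once grouping is finished. A minimal sketch, assuming the pairs have already been produced; the literal `pairs` list and the name `as_tuples` are illustrative only:

```python
from collections import defaultdict

# Hypothetical input: (name, liked item) pairs, as produced by the parsing step.
pairs = [("John", "Steak"), ("John", "Soda"), ("Jane", "Soda"), ("Jim", "Steak")]

grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)  # lists are mutable, so appending works

# Tuples are immutable, so build lists first and convert at the very end.
as_tuples = {key: tuple(values) for key, values in grouped.items()}
print(as_tuples)
```

Note that a one-element tuple is displayed as ('Steak',) with a trailing comma; ('Steak') as written in the question is just a string in parentheses.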

Answer 2 (score: 0)

dic = {}

for i in f.readlines():
    if i.strip():  # skip blank lines; a raw line still contains its '\n'
        if i.split()[0] in dic:
            dic[i.split()[0]].append(i.split()[2])
        else:
            dic[i.split()[0]] = [i.split()[2]]

print(dic)

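This should do it.

Here we iterate over f.readlines(), where f is the file object, and fill the dictionary line by line, using the first part of the split as the key and the last part of the split as the value.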