我有一个看起来像这样的文件:
Mother Jane
Father Bob
Friends Ricky,Jack,Brian,Jordan, \
Ricardo,Sonia,Blake
如您所见,我在“朋友”第一行的末尾有一个换行符。当我想将此文件解析为字典时,当前代码给我一个错误。
我在网上寻找解决方案,并尝试了多种方法,但似乎无济于事。
with open('./file.txt') as f:
content = f.readlines()
dic = {}
for line in content:
line_items = line.strip().split()
if len(line_items) <= 2:
dic[line_items[0]] = line_items[1]
else:
dic[line_items[0]] = line_items[1:]
我想要一个看起来像这样的结果:
dict = {"Mother": "Jane", "Father": "Bob","Friends":[Ricky,Jack,Brian,Jordan,Ricardo,Sonia,Blake]
但是我却遇到索引错误。
答案 0 :(得分:2)
以下似乎有效。它将多行收集到一条逻辑行中,然后进行处理。它还不会将整个文件读入内存。
from pprint import pprint, pformat
dic = {}
with open('./newline_file.txt') as f:
lst = []
for line in iter(f.readline, ''):
line = line.strip()
if line[-1] == '\\': # Ends with backslash?
lst.append(line[:-2])
continue
else:
lst.append(line)
logical_line = ''.join(lst)
lst = []
line_items = logical_line.split(' ')
if len(line_items) == 2:
if ',' in line_items[1]:
dic[line_items[0]] = line_items[1].split(',')
else:
dic[line_items[0]] = line_items[1]
pprint(dic)
输出:
{'Father': 'Bob',
'Friends': ['Ricky', 'Jack', 'Brian', 'Jordan', 'Ricardo', 'Sonia', 'Blake'],
'Mother': 'Jane'}
答案 1 :(得分:0)
您可以使用类似的内容:
import re
with open('file.txt') as f:
c = f.read().strip()
#cleanup line breaks where comma is the last printable character
c = re.sub(r",\s+", ",", c)
final_dict = {}
for l in c.split("\n"):
k,v = l.split()
if "," in v:
final_dict[k] = [x for x in v.split(",")]
else:
final_dict[k] = v
print(final_dict)
输出:
{'Mother': 'Jane', 'Father': 'Bob', 'Friends': ['Ricky', 'Jack', 'Brian', 'Jordan', 'Ricardo', 'Sonia', 'Blake']}
答案 2 :(得分:0)
您可以累积带有连续反斜杠的行,并且仅在完成后才处理行:
dic = {}
continued = ""
for line in content:
if "\\" in line:
continued += line.split("\\")[0]
continue
key,value = (continued+line+" ").split(" ",1)
continued = ""
value = [v.strip() for v in value.strip().split(",") if v != ""]
dic[key] = value[0] if len(value)==1 else value
print(dic) # {'Mother': 'Jane', 'Father': 'Bob', 'Friends': ['Ricky', 'Jack', 'Brian', 'Jordan', 'Ricardo', 'Sonia', 'Blake']}