I have a file containing data like the following, where '>' marks an identifier line.
>test1
this is line 1
hi there
>test2
this is line 3
how are you
>test3
this is line 5 and
who are you
I am trying to build this dictionary:
{'>test1':'this is line 1hi there','>test2':'this is line 3how are you','>test3':'this is line 5 andwho are you'}
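I have read the file in, but I cannot get the data into this form. I want to strip the newline at the end of each line so that each value becomes a single line. No space is needed between the joined lines. Any help would be greatly appreciated.
This is what I have tried so far: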
new_dict = {}
db = open("/home/ak/Desktop/python_files/smalltext.txt")
for line in db:
    if '>' in line:
        new_dict[line]=''
    else:
        new_dict[line]=new_dict[line].append(line)
Answer 0 (score: 3)
Using your approach, it would be:
new_dict = {}
with open("/home/ak/Desktop/python_files/smalltext.txt", 'r') as db:
    for line in db:
        if '>' in line:
            key = line.strip()  # Strip the newline and remember the current header
            new_dict[key] = ''
        else:
            new_dict[key] += line.strip()  # Append the line to the current header's value
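The same loop can be sketched as a small reusable function. This is only an illustration under a few assumptions: the name parse_records and the in-memory sample list are made up here, header detection uses startswith('>') rather than '>' in line so that a '>' in the middle of a data line is not mistaken for a header, and lines that appear before the first header are skipped.

```python
def parse_records(lines):
    """Map each '>' header line to its following lines, joined with no separator."""
    parts = {}
    key = None
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            key = line
            parts[key] = []          # collect pieces in a list, join once at the end
        elif key is not None:        # ignore any lines before the first header
            parts[key].append(line)
    return {k: "".join(v) for k, v in parts.items()}


# Hypothetical stand-in for the lines of the file from the question
sample = [
    ">test1\n", "this is line 1\n", "hi there\n",
    ">test2\n", "this is line 3\n", "how are you\n",
    ">test3\n", "this is line 5 and\n", "who are you\n",
]
result = parse_records(sample)
print(result)
```

An open file object can be passed in place of the sample list, since both yield one line per iteration.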
Answer 1 (score: 1)
Here is a solution using groupby:
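from itertools import groupby

kvs = []
with open(f_name) as f:
    # Group consecutive lines by whether they start with '>'
    for k, v in groupby((e.rstrip() for e in f), lambda s: s.startswith('>')):
        # Header groups are joined as-is; body groups are joined with newlines
        kvs.append(''.join(v) if k else '\n'.join(v))
# Pair each header (even index) with its body (odd index)
print({k: v for k, v in zip(kvs[0::2], kvs[1::2])})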
The resulting dictionary:
{'>test1': 'this is line 1\nhi there',
 '>test2': 'this is line 3\nhow are you',
 '>test3': 'this is line 5 and\nwho are you'}
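The groupby answer keeps a newline between the body lines, while the question asks for them to be joined directly. A possible variant is sketched below; the function name group_records and the sample list are hypothetical, and it assumes that a header with no body should map to an empty string.

```python
from itertools import groupby


def group_records(lines):
    """Pair each '>' header with its body lines, joined with no separator."""
    stripped = (line.rstrip() for line in lines)
    result = {}
    key = None
    for is_header, group in groupby(stripped, key=lambda s: s.startswith(">")):
        if is_header:
            # Consecutive headers land in one group; each gets an empty value,
            # and the last one becomes the key for the body that follows.
            for key in group:
                result[key] = ""
        elif key is not None:
            result[key] = "".join(group)
    return result


# Hypothetical stand-in for the lines of the file from the question
sample = [
    ">test1\n", "this is line 1\n", "hi there\n",
    ">test2\n", "this is line 3\n", "how are you\n",
    ">test3\n", "this is line 5 and\n", "who are you\n",
]
grouped = group_records(sample)
print(grouped)
```

Tracking the current key inside the loop avoids the even/odd slicing, which would misalign if the file started with body lines or contained two headers in a row.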
Answer 2 (score: 0)
You can use a regular expression:
import re

di = {}
# Capture a '>' header line, then everything up to the next header or the end of the text
pat = re.compile(r'^(>.*?)$(.*?)(?=^>|\Z)', re.S | re.M)
with open(fn) as f:
    txt = f.read()
for k, v in ((m.group(1), m.group(2)) for m in pat.finditer(txt)):
    di[k] = v.strip()  # Drop the newlines surrounding the body
print(di)
# {'>test1': 'this is line 1\nhi there', '>test2': 'this is line 3\nhow are you', '>test3': 'this is line 5 and\nwho are you'}
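The regex version strips only the newlines around each body, so the newline between body lines survives. To get the single-line values the question asks for, the body can be split into lines and rejoined. The sketch below reuses the same pattern; the function name regex_records and the sample text are assumptions made for illustration.

```python
import re

# Same pattern as in the answer: a '>' header line, then a lazy body
# that stops before the next header line or at the end of the text.
PATTERN = re.compile(r'^(>.*?)$(.*?)(?=^>|\Z)', re.S | re.M)


def regex_records(text):
    """Map each header to its body with all line breaks removed."""
    return {
        m.group(1): "".join(m.group(2).splitlines())
        for m in PATTERN.finditer(text)
    }


# Hypothetical stand-in for the contents of the file from the question
sample_text = (
    ">test1\nthis is line 1\nhi there\n"
    ">test2\nthis is line 3\nhow are you\n"
    ">test3\nthis is line 5 and\nwho are you\n"
)
records = regex_records(sample_text)
print(records)
```

Because this reads the whole file into memory at once, the line-by-line loop may be a better fit for very large files.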