How can I parse this text file:
mapping apple
v1: v1a : v1b
mapping ball
v2: v2a : v2b
to get
{'apple':['v1','v1a','v1b'], 'ball':['v2','v2a','v2b']}
There can be multiple v lines under a single mapping, for example:
mapping apple
v1: v1a : v1b
v2: v2a : v2b
v3: v3a : v3b
mapping ball
v1: v1a : v1b
v2: v2a : v2b
This is what I have tried so far:
copy = False
for line in fh:
    if line.strip() == "mapping_start":
        copy = True
    elif line.strip() == "mapping_end":
        copy = False
    elif copy:
        if line.find('#') == -1 and len(line.strip()) > 0:
            # make a dictionary here
            pass
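Answer 0: (score: 1)
You could use a defaultdict, which does not need to identify a start and an end. You only need to identify the 'mapping' lines; all other lines contain values: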
from collections import defaultdict

fH = """mapping apple
v1: v1a : v1b
mapping ball
v2: v2a : v2b"""

result = defaultdict(list)
for line in fH.splitlines():
    if 'mapping' in line:
        # header line: the second word is the key
        key = line.split()[1]
    else:
        # value line: split on ':' and strip surrounding whitespace
        for values in line.split(':'):
            result[key].append(values.strip())
print(result)
This prints:
defaultdict(<class 'list'>, {'apple': ['v1', 'v1a', 'v1b'], 'ball': ['v2', 'v2a', 'v2b']})
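For the case in the question where one mapping has several v lines, the same idea can be wrapped in a function. This is only a sketch under a few assumptions: the name parse_mappings is made up for illustration, lines containing '#' are treated as comments (as in the question's own attempt), and all values under one mapping are flattened into a single list, since the question does not say what shape it wants for that case.

```python
from collections import defaultdict


def parse_mappings(lines):
    """Collect the values under each 'mapping <name>' header into one flat list."""
    result = defaultdict(list)
    key = None
    for raw in lines:
        line = raw.strip()
        # Assumption: skip blank lines and any line containing '#'
        if not line or '#' in line:
            continue
        parts = line.split()
        if parts[0] == 'mapping' and len(parts) == 2:
            # header line: remember the current key
            key = parts[1]
        elif key is not None:
            # value line: append every ':'-separated field to the current key
            result[key].extend(v.strip() for v in line.split(':'))
    return dict(result)


sample = """mapping apple
v1: v1a : v1b
v2: v2a : v2b
v3: v3a : v3b
mapping ball
v1: v1a : v1b
v2: v2a : v2b"""

parsed = parse_mappings(sample.splitlines())
print(parsed)
```

Because the function only iterates over lines, an open file handle can be passed in place of sample.splitlines().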
Answer 1: (score: 1)
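with open("lol.txt", 'r') as config:
    adict = {}
    for line in config:
        if 'mapping' in line:
            # header line: the last word is the key
            key = line.strip().split()[-1]
        else:
            # drop spaces, then split the value line on ':'
            line = line.replace(' ', '').strip()
            # a later value line under the same mapping replaces the earlier one
            adict[key] = line.split(':')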
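If a mapping can have several value lines and each line should stay separate, one option is to store a list of rows per key instead of assigning to the key directly. The following is a rough sketch, not part of the answer above: parse_grouped is a hypothetical name, the nested-list output shape is an assumption, and io.StringIO stands in for the opened file so the example runs on its own.

```python
import io


def parse_grouped(config):
    """Map each mapping name to a list of rows, one row per value line."""
    grouped = {}
    key = None
    for line in config:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith('mapping '):
            key = line.split()[-1]
            grouped.setdefault(key, [])
        elif key is not None:
            # keep each value line as its own list of fields
            grouped[key].append([part.strip() for part in line.split(':')])
    return grouped


# io.StringIO stands in for an open file such as open("lol.txt")
fake_file = io.StringIO(
    "mapping apple\n"
    "v1: v1a : v1b\n"
    "v2: v2a : v2b\n"
    "mapping ball\n"
    "v1: v1a : v1b\n"
)
grouped = parse_grouped(fake_file)
print(grouped)
```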
Answer 2: (score: 1)
You can use re together with iter() and next() to avoid unnecessary checks:
import re

input_data = '''
mapping apple
v1: v1a : v1b
mapping ball
v2: v2a : v2b
'''
# convert input to list
input_data = input_data.strip().split('\n')
# create iterator
iterate_over = iter(input_data)
# declare output dictionary
output = {}
# start iteration
for line in iterate_over:
    match = re.findall(r'(?<=^mapping\s)\w+$', line)
    if match:
        try:
            output.update({match[0]: re.sub(r'\s+', '', next(iterate_over)).split(':')})
        except StopIteration:
            break
print(output)
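The reason the iter()/next() combination works is that the for loop and next() pull from the same iterator, so calling next() inside the loop body consumes the line that follows the header. A stripped-down illustration of just that mechanism (the variable names here are made up for the example):

```python
lines = iter([
    "mapping apple",
    "v1: v1a : v1b",
    "mapping ball",
    "v2: v2a : v2b",
])

pairs = []
for header in lines:
    # next() advances the same iterator the for loop is using;
    # the default None avoids StopIteration if a header is the last line
    values = next(lines, None)
    pairs.append((header, values))

print(pairs)
```

Note that this pattern reads exactly one value line per header. With several v lines under one mapping, the code in this answer keeps only the first one, because the remaining value lines do not match the mapping regex and are skipped.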