From a file I have a string that looks like this:
var1 : data1
var2 : data2
dict1 {
    var3 : data3
    dict2 {
        var4 : data4
    }
    var5 : data5
}
dict3 {
    var6 : data6
    var7 : data7
}
and so on. (Line endings are \n and each indent is a \t.) I am trying to convert it into something like this:
Dictionary={"var1":"data1","var2":"data2", "dict1" :
    {"var3":"data3", "dict2" : {
        "var4":"data4" }, "var5":"data5"}
    , "dict3":{"var6":"data6","var7":"data7"}}
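(The indentation is only there to keep it somewhat human-readable.)
The only way I can think of to solve this is to split the string into a list, walk down the list until I find a string containing "}", delete it (so I don't run into it again later), then walk back until I find a string containing "{", strip the whitespace before the name and the "{" after it (currently with temp=re.split ('(\S+) \{',out[z]); for this example, the first temp[1] would be "dict2"), add everything in between, and finally move on to the next "}".
But that is neither fast nor elegant. I must be missing something. The code so far: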
import re

def procvar(strinG):
    x=y=z=temp1=temp2=0
    back = False
    out=re.split (r'\n',strinG) #left over from some other tries
    while z < len(out):
        print("z=",z," out[z]= ", out[z])
        if "{" in out[z]:
            if back == True:
                back = False
                xtemp=re.split (r'(\S+) \{',out[z])
                out[z]=xtemp[1]
                ytemp=xtemp[1]
                temp2=z+1
                print("Temp: ",temp1," - ",out[temp1])
                out[z]={out[z]:[]}
                while temp2 <= temp1:
                    out[z][xtemp[1]].append(out[temp2]) # not finished here, for the time being I insert the strings as they are
                    del out[temp2]
                    temp1-=1
                print(out[z])
        if "}" in out[z]:
            back = True
            del out[z]
            temp1 = z-1
        if back == True:
            z-=1
        else:
            z+=1
    return out
Answer 0: (score: 2)
Your format is close enough to YAML that PyYAML can parse it (easy_install pyyaml, or pip install pyyaml): http://pyyaml.org/wiki/PyYAML
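import yaml

x = """var1 : data1
var2 : data2
dict1 {
    var3 : data3
    dict2 {
        var4 : data4
    }
    var5 : data5
}
dict3 {
    var6 : data6
    var7 : data7
}"""
# Turn "name {" into "name :" and drop the closing braces; the indentation carries the nesting.
# YAML does not allow tabs for indentation, so tab-indented input needs x.expandtabs(4) first.
x2 = x.replace('{', ':').replace('}','')
# safe_load is used because yaml.load requires an explicit Loader in current PyYAML releases
yaml.safe_load(x2)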
{'dict1': {'dict2': {'var4': 'data4'}, 'var3': 'data3', 'var5': 'data5'},
'dict3': {'var6': 'data6', 'var7': 'data7'},
'var1': 'data1',
'var2': 'data2'}
Answer 1: (score: 0)
import re

# key : value regexp
KV_RE = re.compile(r'^\s*(?P<key>[^\s]+)\s+:\s+(?P<value>.+?)\s*$')
# dict start regexp
DS_RE = re.compile(r'^\s*(?P<key>[^\s]+)\s+{\s*$')
# dict end regexp
DE_RE = re.compile(r'^\s*}\s*$')

def parse(s):
    current = {}
    stack = []
    for line in s.strip().splitlines():
        match = KV_RE.match(line)
        if match:
            gd = match.groupdict()
            current[gd['key']] = gd['value']
            continue
        match = DS_RE.match(line)
        if match:
            # remember the parent dict, then descend into the new one
            stack.append(current)
            current = current.setdefault(match.groupdict()['key'], {})
            continue
        match = DE_RE.match(line)
        if match:
            # closing brace: go back to the parent dict
            current = stack.pop()
            continue
        # Error occurred
        print('Error: %s' % line)
        return {}
    return current
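The stack-based idea above can be tried out with a self-contained sketch like the one below. It is only an illustration, not a definitive implementation: the name parse_nested, the SAMPLE string, and the choice to raise ValueError on unbalanced braces (instead of printing an error and returning an empty dict) are assumptions added here.

```python
import re

KV_RE = re.compile(r'^\s*(?P<key>\S+)\s+:\s+(?P<value>.+?)\s*$')  # key : value
DS_RE = re.compile(r'^\s*(?P<key>\S+)\s+\{\s*$')                  # name {
DE_RE = re.compile(r'^\s*\}\s*$')                                 # }

def parse_nested(text):
    """Parse the brace-delimited format into nested dicts (hypothetical helper)."""
    root = {}
    current = root
    stack = []  # parent dicts of the block currently being filled
    for lineno, line in enumerate(text.splitlines(), 1):
        if not line.strip():
            continue  # ignore blank lines
        m = KV_RE.match(line)
        if m:
            current[m.group('key')] = m.group('value')
            continue
        m = DS_RE.match(line)
        if m:
            stack.append(current)  # remember where to return to
            current = current.setdefault(m.group('key'), {})
            continue
        if DE_RE.match(line):
            if not stack:
                raise ValueError('line %d: unmatched "}"' % lineno)
            current = stack.pop()
            continue
        raise ValueError('line %d: cannot parse %r' % (lineno, line))
    if stack:
        raise ValueError('missing "}" at end of input')
    return root

# Assumed sample input: tab-indented and newline-terminated, as described in the question.
SAMPLE = ("var1 : data1\ndict1 {\n\tvar3 : data3\n\tdict2 {\n"
          "\t\tvar4 : data4\n\t}\n\tvar5 : data5\n}\n")
result = parse_nested(SAMPLE)
print(result)
```

The stack only ever holds the parents of the dict being filled, so the input is read in a single forward pass and nothing has to be deleted from a list or revisited.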
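Answer 2: (score: 0)
If your text follows the same regular pattern as the example, you can use ast.literal_eval to parse the string.
First, let's rewrite the string into a legal Python dict literal: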
import re
st='''\
var1 : data1
var2 : data2
dict1 {
var3 : data3
dict2 {
var4 : data4
}
var5 : data5
}
'''
# add commas after key, val pairs
st=re.sub(r'^(\s*\w+\s*:\s*\w+)\s*$',r'\1,',st,flags=re.M)
# insert colon after name and before opening brace
st=re.sub(r'^\s*(\w+\s*){\s*$',r'\1:{',st,flags=re.M)
# add comma closing brace
st=re.sub(r'^(\s*})\s*$',r'\1,',st,flags=re.M)
# put names into quotes
st=''.join(['"{}"'.format(s.group(0)) if re.search(r'\w+',s.group(0)) else s.group(0)
for s in re.finditer(r'\w+|\W+',st)])
# add opening and closing braces
st='{'+st+'}'
print(st)
This prints the modified string:
{"var1" : "data1",
"var2" : "data2",
"dict1" :{
"var3" : "data3",
"dict2" :{
"var4" : "data4",
},
"var5" : "data5",
},}
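Now use ast to turn the string into a data structure:
import ast
print(ast.literal_eval(st))
This prints (key order may differ depending on the Python version):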
{'dict1': {'var5': 'data5', 'var3': 'data3', 'dict2': {'var4': 'data4'}}, 'var1': 'data1', 'var2': 'data2'}
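For convenience, the same rewriting steps can be wrapped in one function. The sketch below is an assumption-laden illustration: text_to_dict and the sample string are hypothetical names introduced here, the word-quoting step uses a single re.sub instead of the join over re.finditer, and it only works when every key and value is a single run of word characters (letters, digits, underscore).

```python
import ast
import re

def text_to_dict(text):
    """Rewrite the brace format into a Python dict literal and evaluate it (hypothetical helper)."""
    # add a comma after each "key : value" pair
    text = re.sub(r'^(\s*\w+\s*:\s*\w+)\s*$', r'\1,', text, flags=re.M)
    # insert a colon between a name and its opening brace
    text = re.sub(r'^\s*(\w+\s*)\{\s*$', r'\1:{', text, flags=re.M)
    # add a comma after each closing brace
    text = re.sub(r'^(\s*\})\s*$', r'\1,', text, flags=re.M)
    # put every run of word characters (keys and values alike) into quotes
    text = re.sub(r'\w+', r'"\g<0>"', text)
    # wrap everything in an outer dict and evaluate it as a literal
    return ast.literal_eval('{' + text + '}')

# Assumed sample input: tab-indented and newline-terminated, as described in the question.
sample = ("var1 : data1\ndict1 {\n\tvar3 : data3\n\tdict2 {\n"
          "\t\tvar4 : data4\n\t}\n\tvar5 : data5\n}\n")
parsed = text_to_dict(sample)
print(parsed)
```

ast.literal_eval accepts the trailing commas that the substitutions leave behind, whereas json.loads would reject them, which is why the rewritten string is evaluated as a Python literal and not as JSON.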