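Experts,
I wrote a program that converts a string into a dictionary. It produces the expected result, but I doubt that this is the Pythonic way to do it. I would appreciate suggestions.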
txt = '''
name : xxxx
desgination : yyyy
cities :
    LA : Los Angeles
    NY : New York
HeadQuarters :
    LA : LA
    NY : NY
Country : USA
'''
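I split each line on the colon (:) and stored the pairs in a dictionary. Here cities and HeadQuarters each contain another dictionary, so I wrote the code like this: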
if k == 'cities':
    D[k] = {}
    continue
elif k == 'HeadQuarters':
    D[k] = {}
    continue
elif k == 'LA':
    if D.has_key('cities'):
        if D['cities'].get(k) is None:
            D['cities'][k] = v
    if D.has_key('HeadQuarters'):
        if D['HeadQuarters'].get(k) is None:
            D['HeadQuarters'][k] = v
elif k == 'NY':
    if D.has_key('cities'):
        if D['cities'].get(k) is None:
            D['cities'][k] = v
    if D.has_key('HeadQuarters'):
        if D['HeadQuarters'].get(k) is None:
            D['HeadQuarters'][k] = v
else:
    D[k] = v
Answer 0 (score: 1)
Not sure whether this is Pythonic:
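import re

x = re.split(r':|\n', txt)[1:-1]
x = [s.rstrip() for s in x]  # keep leading spaces: they mark nested keys
x = list(zip(x[::2], x[1::2]))  # (key, value) pairs
d = {}
for i in range(len(x)):
    if not x[i][0].startswith(' '):  # top-level key
        if x[i][1] != '':
            d[x[i][0]] = x[i][1]
        else:
            # empty value: collect the indented pairs that follow into a sub-dict
            t = x[i][0]
            tmp = {}
            i += 1
            while i < len(x) and x[i][0].startswith(' '):
                tmp[x[i][0].strip()] = x[i][1]
                i += 1
            d[t] = tmp
print(d)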
Output:
{'Country': ' USA', 'cities': {'NY': ' New York', 'LA': ' Los Angeles'}, 'name': ' xxxx', 'desgination': ' yyyy', 'HeadQuarters': {'NY': ' NY', 'LA': ' LA'}}
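The values in the output above keep a leading space (' USA') because only rstrip() is applied to every piece. A minimal sketch of one way around that, assuming the same txt layout with four-space indentation: strip the values on both sides, but strip the keys only on the right, since their leading spaces are what mark them as nested. Like the regex split itself, this assumes no value contains a colon.

```python
import re

txt = '''
name : xxxx
desgination : yyyy
cities :
    LA : Los Angeles
    NY : New York
HeadQuarters :
    LA : LA
    NY : NY
Country : USA
'''

parts = re.split(r':|\n', txt)[1:-1]
keys = [k.rstrip() for k in parts[::2]]    # leading spaces kept on purpose
values = [v.strip() for v in parts[1::2]]  # values stripped on both sides
pairs = list(zip(keys, values))

for key, value in pairs:
    print(repr(key), repr(value))
```

The resulting pairs can then go through the same loop as before, and the stored values no longer carry the extra space.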
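Answer 1 (score: 1)
You can use the split method here, plus a little recursion for the sub-dictionaries, assuming your sub-dictionaries start with a tab (\t) or four spaces: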
def txt_to_dict(txt):
    data = {}
    lines = txt.split('\n')
    i = 0
    while i < len(lines):
        try:
            key, val = lines[i].split(':')
        except ValueError:
            # print("Invalid row format")
            i += 1
            continue
        key = key.strip()
        val = val.strip()
        if len(val) == 0:
            # empty value: gather the indented lines that follow and parse them recursively
            i += 1
            sub = ""
            while i < len(lines) and lines[i].startswith(('\t', '    ')):
                sub += lines[i] + '\n'
                i += 1
            data[key] = txt_to_dict(sub[:-1])  # remove last newline character
        else:
            data[key] = val
            i += 1
    return data
Then you just call it on your txt variable:
>>> print(txt_to_dict(txt))
{'Country': 'USA', 'cities': {'NY': 'New York', 'LA': 'Los Angeles'}, 'name': 'xxxx', 'desgination': 'yyyy', 'HeadQuarters': {'NY': 'NY', 'LA': 'LA'}}
Sample output is shown above. The sub-dictionaries are created correctly.
Some error handling has been added.
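The same indentation idea can also be written without recursion, which may help if the input ever nests more than one level deep. The sketch below is only an illustration: parse_indented is a hypothetical name, and it assumes indentation is consistent (all spaces or all tabs) and that a key with an empty value always opens a sub-dictionary. It keeps a stack of (indent, dict) pairs and pops back to the nearest less-indented parent for each line.

```python
def parse_indented(text):
    """Parse 'key : value' lines into nested dicts, using indentation for nesting."""
    root = {}
    # Each entry is (indent of the line that opened the dict, the dict itself).
    stack = [(-1, root)]
    for raw in text.splitlines():
        if ':' not in raw:
            continue  # skip blank or malformed lines
        indent = len(raw) - len(raw.lstrip())
        key, _, value = (part.strip() for part in raw.partition(':'))
        # Drop every open block that is not less indented than this line.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if value:
            parent[key] = value
        else:
            parent[key] = {}
            stack.append((indent, parent[key]))
    return root


sample = '''
name : xxxx
cities :
    LA : Los Angeles
    NY : New York
Country : USA
'''
result = parse_indented(sample)
print(result)
```

Because partition splits at the first colon only, a value that itself contains a colon is kept whole instead of raising ValueError.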
Answer 2 (score: 1)
This produces the same output as your code. It mostly refactors what you had and uses some common Python idioms.
txt = '''
name : xxxx
desgination : yyyy
cities :
    LA : Los Angeles
    NY : New York
HeadQuarters :
    LA : LA
    NY : NY
Country : USA
'''
D = {}  # added to test code
for line in (line for line in txt.splitlines() if line):  # added to test code
    k, _, v = [s.strip() for s in line.partition(':')]  # added to test code
    if k in {'cities', 'HeadQuarters'}:
        D[k] = {}
        continue
    elif k in {'LA', 'NY'}:
        for k2 in (x for x in ('cities', 'HeadQuarters') if x in D):
            if k not in D[k2]:
                D[k2][k] = v
    else:
        D[k] = v

import pprint
pprint.pprint(D)
Output:
{'Country': 'USA',
 'HeadQuarters': {'LA': 'LA', 'NY': 'NY'},
 'cities': {'LA': 'Los Angeles', 'NY': 'New York'},
 'desgination': 'yyyy',
 'name': 'xxxx'}
Answer 3 (score: 1)
You can use an existing YAML parser (the PyYAML package):
import yaml # $ pip install pyyaml
data = yaml.safe_load(txt)
Result:
{'Country': 'USA',
 'HeadQuarters': {'LA': 'LA', 'NY': 'NY'},
 'cities': {'LA': 'Los Angeles', 'NY': 'New York'},
 'desgination': 'yyyy',
 'name': 'xxxx'}
The parser accepts your input as is, but to make it more conformant YAML it needs small modifications:
---
Country: USA
HeadQuarters:
  LA: LA
  NY: NY
cities:
  LA: "Los Angeles"
  NY: "New York"
desgination: yyyy
name: xxxx
Answer 4 (score: 0)
This works, although the nested entries are flattened into the top level:
txt = '''
name : xxxx
desgination : yyyy
cities :
    LA : Los Angeles
    NY : New York
HeadQuarters :
    LA : LA
    NY : NY
Country : USA
'''
di = {}
for line in txt.split('\n'):
    if len(line) > 1:
        di[line.split(':')[0].strip()] = line.split(':')[1].strip()
print(di)  # {'name': 'xxxx', 'desgination': 'yyyy', 'LA': 'LA', 'Country': 'USA', 'HeadQuarters': '', 'NY': 'NY', 'cities': ''}
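One detail of the flat loop worth noting: line.split(':')[1] keeps only the text between the first and second colon, so a value that contains a colon is truncated, and a non-empty line without any colon raises IndexError. A small sketch of the same flat approach using str.partition, which splits at the first colon only; parse_flat and the sample text are hypothetical names used just for illustration.

```python
def parse_flat(text):
    """Flat 'key : value' parser; nested keys still end up at the top level."""
    result = {}
    for line in text.splitlines():
        if ':' not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition(':')  # split at the first colon only
        result[key.strip()] = value.strip()
    return result


flat = parse_flat("site : http://example.com\nname : xxxx\n\nno colon here")
print(flat)
```

Here the URL survives intact, whereas split(':')[1] would have returned only 'http'.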