I have data organized in blocks that looks like this:
>Head1
foo 0 1.10699e-05 2.73049e-05
bar 0.939121 0.0173732 0.0119144
qux 0 2.34787e-05 0.0136463
>Head2
foo 0 0.00118929 0.00136993
bar 0.0610655 0.980495 0.997179
qux 0.060879 0.982591 0.974276
The fields within each block are whitespace-separated. What I want to do is convert the blocks into a nested dictionary like this:
{ 'Head1': {'foo': '0 1.10699e-05 2.73049e-05',
            'bar': '0.939121 0.0173732 0.0119144',
            'qux': '0 2.34787e-05 0.0136463'},
  'Head2': {'foo': '0 0.00118929 0.00136993',
            'bar': '0.0610655 0.980495 0.997179',
            'qux': '0.060879 0.982591 0.974276'}
}
What is the way to do this in Python? I'm not sure how to proceed from here:
def parse():
    caprout = "tmp.txt"
    with open(caprout, 'r') as file:
        datalines = (ln.strip() for ln in file)
        for line in datalines:
            if line.startswith(">Head"):
                print(line)   # block header
            elif not line.strip():
                print(line)   # blank line
            else:
                print(line)   # data line
    return

def main():
    parse()
    return

if __name__ == '__main__':
    main()
Answer 0: (score: 1)
This is the simplest solution I can come up with:
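filename = "tmp.txt"  # input file, as named in the question
mainDict = dict()
with open(filename, 'r') as file:
    for line in file:
        line = line.strip()
        if line == "":
            continue  # skip blank lines
        # Note: line.find("Head") is truthy for every line here (it returns
        # 1 or -1), so test the leading '>' instead.
        if line.startswith(">"):
            # header line: start a new block, dropping the leading '>'
            lastBlock = line[1:]
            mainDict[lastBlock] = dict()
            continue
        # data line: text before the first space is the key, the rest the value
        splitLine = line.partition(" ")
        mainDict[lastBlock][splitLine[0]] = splitLine[2]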
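The same idea can be wrapped in a function that takes any iterable of lines, which makes it easier to try without a file on disk. This is only a sketch written for Python 3: the name `parse_blocks` and the error raised for a data line that appears before any header are my own choices, not something the question asks for. Header lines are recognized by a leading `>`, and each data line is split once on whitespace.

```python
def parse_blocks(lines):
    """Group 'key value...' lines under the most recent '>Header' line."""
    result = {}
    current = None  # name of the block currently being filled
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # header line: drop the '>' and open a new inner dict
            current = line[1:]
            result[current] = {}
        elif current is None:
            # assumption: data before the first header is treated as an error
            raise ValueError("data line before any header: %r" % line)
        else:
            # split once on any whitespace: key first, remainder is the value
            parts = line.split(None, 1)
            result[current][parts[0]] = parts[1] if len(parts) > 1 else ""
    return result


sample = """\
>Head1
foo 0 1.10699e-05 2.73049e-05
bar 0.939121 0.0173732 0.0119144
>Head2
qux 0.060879 0.982591 0.974276
"""
parsed = parse_blocks(sample.splitlines())
```

To read the real file, something like `with open("tmp.txt") as fh: parsed = parse_blocks(fh)` should work, since a file object iterates over its lines.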
Answer 1: (score: 1)
File:
[sgeorge@sgeorge-ld1 tmp]$ cat tmp.txt
>Head1
foo 0 1.10699e-05 2.73049e-05
bar 0.939121 0.0173732 0.0119144
qux 0 2.34787e-05 0.0136463
>Head2
foo 0 0.00118929 0.00136993
bar 0.0610655 0.980495 0.997179
qux 0.060879 0.982591 0.974276
Script:
[sgeorge@sgeorge-ld1 tmp]$ cat a.py
import json

dict_ = {}

def parse():
    caprout = "tmp.txt"
    with open(caprout, 'r') as file:
        datalines = (ln.strip() for ln in file)
        for line in datalines:
            if line != '':
                if line.startswith(">Head"):
                    # header: strip the '>' and open a new inner dict
                    key = line.replace('>', '')
                    dict_[key] = {}
                else:
                    # data: first token is the nested key, the rest is the value
                    nested_key, value = line.split(' ', 1)
                    dict_[key][nested_key] = value
    print(json.dumps(dict_))

parse()
Execution:
[sgeorge@sgeorge-ld1 tmp]$ python a.py | python -m json.tool
{
    "Head1": {
        "bar": "0.939121 0.0173732 0.0119144",
        "foo": "0 1.10699e-05 2.73049e-05",
        "qux": "0 2.34787e-05 0.0136463"
    },
    "Head2": {
        "bar": "0.0610655 0.980495 0.997179",
        "foo": "0 0.00118929 0.00136993",
        "qux": "0.060879 0.982591 0.974276"
    }
}
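If you would rather not pipe through `python -m json.tool`, `json.dumps` can produce the indented, key-sorted form itself through its `indent` and `sort_keys` parameters. A small sketch in Python 3, using a hand-written dictionary (`nested`, a name chosen here) that holds two of the rows from the file above:

```python
import json

# A small stand-in for the parsed result (two rows from Head1 above).
nested = {
    "Head1": {
        "foo": "0 1.10699e-05 2.73049e-05",
        "bar": "0.939121 0.0173732 0.0119144",
    },
}

# indent=4 pretty-prints; sort_keys=True orders keys alphabetically.
pretty = json.dumps(nested, indent=4, sort_keys=True)
print(pretty)
```

With `sort_keys=True` the inner keys come out as `bar` before `foo`, regardless of insertion order.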