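I would like to know how to extract text into a dictionary in Python. The text file has the format shown below, and I want to extract it so that an object such as Earth becomes a key, with its radius, period, and other attributes stored under that key.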
RootObject: Sun
Object: Sun
Satellites: Mercury,Venus,Earth,Mars,Jupiter,Saturn,Uranus,Neptune,Ceres,Pluto,Haumea,Makemake,Eris
Radius: 20890260
Orbital Radius: 0
Object: Earth
Orbital Radius: 77098290
Period: 365.256363004
Radius: 6371000.0
Satellites: Moon
Object: Moon
Orbital Radius: 18128500
Radius: 1737000.10
Period: 27.321582
Answer 0 (score: 3)
nk="""
RootObject: Sun
Object: Sun
Satellites: Mercury,Venus,Earth,Mars,Jupiter,Saturn,Uranus,Neptune,Ceres,Pluto,Haumea,Makemake,Eris
Radius: 20890260
Orbital Radius: 0
Object: Earth
Orbital Radius: 77098290
Period: 365.256363004
Radius: 6371000.0
Satellites: Moon
Object: Moon
Orbital Radius: 18128500
Radius: 1737000.10
Period: 27.321582
"""
my_test_dict = {}
for x in nk.splitlines():
    if ':' in x:
        # split only on the first colon so values may contain ':'
        key, value = [part.strip() for part in x.split(':', 1)]
        if key == 'RootObject':
            root_obj = value
        elif key == 'Object':
            # start a new entry for this object
            my_test_dict[value] = {}
            current_dict = value
            if value != root_obj:
                # the parent is any earlier object that lists this one as a satellite
                for x1 in my_test_dict:
                    if 'Satellites' in my_test_dict[x1]:
                        if value in my_test_dict[x1]['Satellites'].split(','):
                            my_test_dict[value]['RootObject'] = x1
        else:
            my_test_dict[current_dict][key] = value
print(my_test_dict)
Output:
{
'Sun':
{
'Satellites': 'Mercury,Venus,Earth,Mars,Jupiter,Saturn,Uranus,Neptune,Ceres,Pluto,Haumea,Makemake,Eris',
'Orbital Radius': '0',
'Radius': '20890260'
},
'Moon':
{
'Orbital Radius': '18128500',
'Radius': '1737000.10',
'Period': '27.321582',
'RootObject': 'Earth'
},
'Earth':
{
'Satellites': 'Moon',
'Orbital Radius': '77098290',
'Radius': '6371000.0',
'Period': '365.256363004',
'RootObject': 'Sun'
}
}
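Note that the loop above only finds a parent when the parent's `Satellites` line has already been read before the child's `Object` line. As a minimal sketch of an order-independent alternative, the parent lookup could be built in a second pass over the finished dictionary. The function name `parent_map` and the small sample dictionary are hypothetical and only serve as illustration:

```python
def parent_map(objects):
    """Map each satellite name to the object that lists it under 'Satellites'."""
    parents = {}
    for name, info in objects.items():
        # objects without a 'Satellites' entry contribute nothing
        for satellite in info.get("Satellites", "").split(","):
            satellite = satellite.strip()
            if satellite:
                parents[satellite] = name
    return parents

# hypothetical, shortened sample in the same shape as the output above
objects = {
    "Moon": {"Radius": "1737000.10"},
    "Earth": {"Satellites": "Moon"},
    "Sun": {"Satellites": "Mercury,Venus,Earth"},
}
parents = parent_map(objects)
print(parents["Moon"])   # Earth
print(parents["Earth"])  # Sun
```

Because the lookup runs after parsing, it does not matter whether Moon appears before or after Earth in the file.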
Answer 1 (score: 2)
Using a modification of one of the above, you get the following:
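def read_next_object(file):
    """Yield one dict per object read from an iterable of lines."""
    obj = {}
    for line in file:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, val = line.split(": ", 1)
        # a second "Object" key marks the start of the next object
        if key in obj and key == "Object":
            yield obj
            obj = {}
        obj[key] = val
    if obj:
        yield obj  # emit the last object, if any

planets = {}
with open("test.txt", 'r') as f:
    for obj in read_next_object(f):
        planets[obj["Object"]] = obj
print(planets)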
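Once the RootObject case is fixed, I believe this is the final dictionary you want from the sample data you posted. It is a dictionary of planets, where each planet is a dictionary of its own information.
print(planets["Sun"]["Radius"])
should print the value 20890260.
The output of the above looks like this: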
{ 'Earth': { 'Object': 'Earth',
'Orbital Radius': '77098290',
'Period': '365.256363004',
'Radius': '6371000.0',
'Satellites': 'Moon'},
'Moon': { 'Object': 'Moon',
'Orbital Radius': '18128500',
'Period': '27.321582',
'Radius': '1737000.10'},
'Sun': { 'Object': 'Sun',
'Orbital Radius': '0',
'Radius': '20890260',
'RootObject': 'Sun',
'Satellites': 'Mercury,Venus,Earth,Mars,Jupiter,Saturn,Uranus,Neptune,Ceres,Pluto,Haumea,Makemake,Eris'}}
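All values in the output above are strings. If numbers are needed for calculations, a conversion step along these lines could be applied afterwards. This is only a sketch: the helper names `to_number` and `convert_numeric_fields` are hypothetical, and the sample dictionary is a shortened copy of the output above.

```python
def to_number(text):
    """Return text as an int or float if it parses as one, otherwise unchanged."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text

def convert_numeric_fields(planets):
    """Return a copy of the nested dict with numeric strings converted."""
    return {
        name: {key: to_number(val) for key, val in info.items()}
        for name, info in planets.items()
    }

# shortened sample in the same shape as the output above
planets = {
    "Earth": {"Object": "Earth", "Orbital Radius": "77098290",
              "Period": "365.256363004", "Satellites": "Moon"},
}
converted = convert_numeric_fields(planets)
print(converted["Earth"]["Orbital Radius"] + 1)  # 77098291
```

Trying `int` before `float` keeps whole numbers such as the orbital radius as integers, while names such as `Moon` fail both conversions and stay as strings.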
Answer 2 (score: 0)
Assuming you want the elements of comma-separated values as a list, try:
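mydict = {}
with open(my_file, 'r') as the_file:  # my_file: path to the text file
    for line in the_file:
        line = line.strip()  # drop the trailing newline
        if not line: continue # skip blank lines
        key, val = line.split(": ", 1)
        val = val.split(",")
        # keys repeated across objects (e.g. Radius) overwrite earlier values
        mydict[key] = val if len(val) > 1 else val[0]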
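The snippet above stores everything in one flat dictionary, so keys that occur for several objects end up holding only the last value read. A rough sketch of how the same list-splitting idea could be combined with one dictionary per object is shown below. The function name `parse_objects` and the inline sample text are assumptions made for illustration; the sketch reads from a string rather than a file so that it is self-contained.

```python
def parse_objects(text):
    """Group 'key: value' lines into one dict per 'Object' entry."""
    objects = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, val = [part.strip() for part in line.split(":", 1)]
        if key == "Object":
            # start (or reuse) the dict for this object
            current = objects.setdefault(val, {})
        elif current is not None:
            # lines before the first 'Object' (e.g. RootObject) are ignored here
            parts = val.split(",")
            current[key] = parts if len(parts) > 1 else parts[0]
    return objects

sample = """
RootObject: Sun
Object: Earth
Orbital Radius: 77098290
Satellites: Moon
Object: Sun
Satellites: Mercury,Venus,Earth
"""
result = parse_objects(sample)
print(result["Sun"]["Satellites"])  # ['Mercury', 'Venus', 'Earth']
```

As in the snippet above, a value with a single element (such as `Moon`) stays a plain string; always returning `parts` instead would give a list in every case.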