I am trying to parse this comma-separated content, which lists some HiPS surveys. Here is my code:
with open('surveys.txt') as data:
    text = data.read()
surveys = OrderedDict()
for survey in text.split('\n'):
    for line in survey.split('\n'):
        # Skip empty or comment lines
        if line == '' or line.startswith('#'):
            continue
        try:
            key, value = [_.strip() for _ in line.split('=')]
            surveys[key] = value
        except ValueError:
            continue
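This gives me a single OrderedDict containing only the elements of the last survey. That happens because the contents of surveys are overwritten after each loop iteration.
I tried to solve this by creating a new OrderedDict for each survey and appending it to a list, using the following code: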
with open('surveys.txt') as data:
    text = data.read()
surveys = []
for survey in text.split('\n'):
    survey_OD = OrderedDict()
    for line in survey.split('\n'):
        # Skip empty or comment lines
        if line == '' or line.startswith('#'):
            continue
        try:
            key, value = [_.strip() for _ in line.split('=')]
            survey_OD[key] = value
        except ValueError:
            continue
    surveys.append(survey_OD)
But this creates a separate OrderedDict for each comma-separated value, like this:
OrderedDict([('hips_order', '3')])
OrderedDict([('hips_frame', 'galactic')])
OrderedDict([('hips_tile_format', 'jpeg fits')])
whereas I was expecting something like this:
OrderedDict([('hips_order', '3'), ('hips_frame', 'galactic'), ('hips_tile_format', 'jpeg fits')])
Answer 0 (score: 1)
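Let's improve your first approach a bit. The code below treats each survey as a block of lines and assumes the blocks are separated by a blank line, so it splits the text on '\n\n' instead of '\n' and builds one OrderedDict per block.
So we can write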
from collections import OrderedDict

with open('surveys.txt') as data:
    text = data.read()
surveys = list()
for raw_survey in text.split('\n\n'):
    survey = OrderedDict()
    # Skip empty lines
    for line in filter(None, raw_survey.split('\n')):
        # Skip comment lines
        if line.startswith('#'):
            continue
        try:
            key, value = [_.strip() for _ in line.split('=')]
            survey[key] = value
        except ValueError:
            continue
    surveys.append(survey)
which gives us
>>> surveys[0]
OrderedDict([('ID', 'CDS/C/MUSE-M42'), ('creator_did', 'ivo://CDS/C/MUSE-M42'),
('obs_collection', 'MUSE-M42'),
('obs_title', 'MUSE map of the central Orion Nebula (M 42)'), (
'obs_description',
'Integral-field spectroscopic dataset of the central part of the Orion Nebula (M 42), observed with the MUSE instrument at the ESO VLT (reduced the data with the public MUSE pipeline) representing a FITS cube with a spatial size of ~5.9\'x4.9\' (corresponding to ~0.76 pc x 0.63 pc) and a contiguous wavelength coverage of 4595...9366 Angstrom, spatially sampled at 0.2", with a sampling of 1.25 Angstrom in dispersion direction.'),
('obs_ack', 'Based on data obtained from the ESO/VLT'),
('prov_progenitor', 'MUSE Consortium'),
('bib_reference', '2015A&A...582A.114W'),
('obs_copyright', 'Copyright mention of the original data'),
('obs_copyright_url', 'http://muse-vlt.eu/science/m42/'),
('hips_release_date', '2015-07-07T00:29Z'),
('hips_builder', 'Aladin/HipsGen v9.505'), ('hips_order', '12'),
('hips_pixel_cut', '0 7760'), ('hips_tile_format', 'png fits'),
('hips_cube_depth', '3818'), ('hips_cube_firstframe', '1909'),
('hips_frame', 'equatorial'), ('dataproduct_type', 'image'),
('t_min', '56693'), ('t_max', '56704'), ('em_min', '4,595e-7'),
('em_max', '9,366e-7'), ('hips_version', '1.31'),
('hips_creation_date', '03/07/15 12:00:30'),
('hips_creator', 'CDS (P.Fernique)'), ('hips_tile_width', '512'),
('hips_status', 'public master clonableOnce'),
('hips_pixel_bitpix', '-32'), ('data_pixel_bitpix', '-32'),
('hips_hierarchy', 'mean'), ('hips_initial_ra', '83.82094'),
('hips_initial_dec', '-5.39542'), ('hips_initial_fov', '0.09811'),
('hips_pixel_scale', '2.795E-5'), ('s_pixel_scale', '5.555E-5'),
('moc_sky_fraction', '2.980E-7'), ('hips_estsize', '87653'),
('data_bunit', '10**(-20)*erg/s/cm**2/Angstrom'),
('data_cube_crpix3', '1'), ('data_cube_crval3', '4595'),
('data_cube_cdelt3', '1.25'), ('data_cube_bunit3', 'Angstrom'),
('client_application', 'AladinDesktop'),
('hips_copyright', 'CNRS/Unistra'), ('obs_regime', 'Optical'),
('hips_service_url', 'http://alasky.unistra.fr/MUSE/MUSE-M42'), (
'hips_service_url_1',
'http://alaskybis.unistra.fr/MUSE/MUSE-M42'),
('hips_status_1', 'public mirror clonable'), (
'hips_service_url_2',
'https://alaskybis.unistra.fr/MUSE/MUSE-M42'),
('hips_status_2', 'public mirror clonable'), ('moc_order', '12'),
('obs_initial_ra', '83.82094'), ('obs_initial_dec', '-5.39542'),
('obs_initial_fov', '0.014314526715905856'),
('TIMESTAMP', '1490387811000')])
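For anyone who wants to try the idea without the real file, here is a minimal self-contained sketch of the same approach. It is not the code from the answer above: `parse_surveys` is a hypothetical helper name, the sample text is made up (including the `example.org` URL), and it assumes that surveys are separated by a blank line. It also swaps `line.split('=')` for `str.partition('=')`, so that a value which itself contains `=` is kept rather than skipped.

```python
from collections import OrderedDict


def parse_surveys(text):
    """Parse blank-line-separated blocks of 'key = value' lines.

    Hypothetical helper for illustration, not part of the original answer.
    """
    surveys = []
    for block in text.split('\n\n'):
        survey = OrderedDict()
        for line in block.split('\n'):
            line = line.strip()
            # Skip empty and comment lines
            if not line or line.startswith('#'):
                continue
            # Split on the first '=' only, so values may contain '='
            key, sep, value = line.partition('=')
            if not sep:
                continue  # no '=' on this line: ignore it
            survey[key.strip()] = value.strip()
        # Ignore blocks that held only blank or comment lines
        if survey:
            surveys.append(survey)
    return surveys


# Made-up sample laid out the way the code above expects (assumption)
sample = """\
# first survey
hips_order = 3
hips_frame = galactic
hips_tile_format = jpeg fits

hips_order = 12
hips_service_url = http://example.org/hips?id=1
"""

surveys = parse_surveys(sample)
print(len(surveys))
print(surveys[0])
```

With this sample, `surveys` holds two dictionaries, and the first one contains the three `hips_*` keys in file order, which is the shape asked for in the question.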