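How to read a txt/data file in which the columns are not cleanly separated
I have a file with many columns that are not cleanly delimited on each line (see the file below).
I want to extract only some of the parameters from each line of the file.
File contents: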
version=2
id NumCompo Species QuantumNumbers Frequency Eup Gup Aij FitFreq DeltaFitFreq Vo deltaVo FWHM_G deltaFWHM_G FWHM_L deltaFWHM_L Intensity deltaIntensity FitFlux deltaFitFlux Freq.IntensityMax V.IntensityMax FWHM IntensityMax Flux1stMom deltaFlux1stMom rms deltaV Cal Size TelescopePath TelescopeName
None None None None MHz K None s-1 MHz MHz km/s km/s km/s km/s km/s km/s K K K.km/s K.km/s MHz km/s km/s K K.km/s K.km/s mK km/s % arcsec None None
44003 1 CH3CHO (18 1 17 2 _ 17 1 16 2) 350362.8435 163.4598498101699 74 0.0014741376966675667 350355.5848769065 Infinity 6.210933891166498 Infinity 1.739817511288065 Infinity 0.0 0.0 2.8623661141900496 Infinity 5.301075803032265 0.0 350355.5 6.2835599041722 1.7199802504879502 2.848570585251 4.899485148752622 0.0 3.854414571567953E-5 0.8289084198960507 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
44003 1 CH3CHO (18 1 17 0 _ 17 1 16 0) 350445.7777 163.41850853869101 74 0.0014742735891251069 350437.70831029414 0.12591692133973328 6.903042719892636 0.10771941048880102 2.203766561226652 0.2947187307742186 0.0 0.0 3.482121868378891 0.34484851265307565 8.168010359688692 0.0 350437.5 7.08124391130517 2.11135392209597 3.597269296646 7.595108638308943 0.0 204.05560763773454 0.8287143946045267 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
44003 1 CH3CHO (13 2 12 1 _ 12 1 11 2) 350572.1804 93.02188980281947 54 9.699686188970169E-5 350566.94541642064 NaN 4.476706032550229 NaN 0.4589727274204179 NaN 0.0 0.0 23.273694629520087 NaN 11.372220042377085 0.0 350566.40625 4.9377752090695495 1.3912897271407283 1.418276190758 1.9732330944498893 0.0 425.46913502384274 0.8284143332887641 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
44003 1 CH3CHO (8 6 3 3 _ 9 5 4 3) 350808.1122 318.0348265703963 34 1.075967688918428E-5 350801.2794264813 Infinity 5.839129418307394 Infinity 0.565736741450577 Infinity 0.0 0.0 11.264418715889377 Infinity 6.784303066688616 0.0 350801.75 5.436988227894669 1.3717772790578981 2.775228977203 3.806996055110165 0.0 156.0928146251678 0.8412065022978162 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
44003 1 CH3CHO (8 6 2 3 _ 9 5 5 3) 350808.1275 318.03482730468073 34 1.0759677371497151E-5 350801.2795084328 Infinity 5.852134153594954 Infinity 0.5663191176333013 Infinity 0.0 0.0 11.228477212030814 Infinity 6.769618049955546 0.0 350801.75 5.450063014554457 1.3717772790578981 2.775228977203 3.806996055110165 0.0 156.8862242548636 0.8412065022978162 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
Answer 0 (score: 0)
If the QuantumNumbers column did not contain spaces, you could simply split each line on whitespace.
To get there, you can use a regex to replace all spaces within brackets with commas.
So you can do the following:
import re
with open('f.txt') as fh:
    s = re.sub(r' (?=[^\(\)]*\))', ',', fh.read().strip())
rows = [r.split() for r in s.split('\n')]
This first evaluates s to:
version=2
id NumCompo Species QuantumNumbers Frequency Eup Gup Aij FitFreq DeltaFitFreq Vo deltaVo FWHM_G deltaFWHM_G FWHM_L deltaFWHM_L Intensity deltaIntensity FitFlux deltaFitFlux Freq.IntensityMax V.IntensityMax FWHM IntensityMax Flux1stMom deltaFlux1stMom rms deltaV Cal Size TelescopePath TelescopeName
None None None None MHz K None s-1 MHz MHz km/s km/s km/s km/s km/s km/s K K K.km/s K.km/s MHz km/s km/s K K.km/s K.km/s mK km/s % arcsec None None
44003 1 CH3CHO (18,1,17,2,_,17,1,16,2) 350362.8435 163.4598498101699 74 0.0014741376966675667 350355.5848769065 Infinity 6.210933891166498 Infinity 1.739817511288065 Infinity 0.0 0.0 2.8623661141900496 Infinity 5.301075803032265 0.0 350355.5 6.2835599041722 1.7199802504879502 2.848570585251 4.899485148752622 0.0 3.854414571567953E-5 0.8289084198960507 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
44003 1 CH3CHO (18,1,17,0,_,17,1,16,0) 350445.7777 163.41850853869101 74 0.0014742735891251069 350437.70831029414 0.12591692133973328 6.903042719892636 0.10771941048880102 2.203766561226652 0.2947187307742186 0.0 0.0 3.482121868378891 0.34484851265307565 8.168010359688692 0.0 350437.5 7.08124391130517 2.11135392209597 3.597269296646 7.595108638308943 0.0 204.05560763773454 0.8287143946045267 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
44003 1 CH3CHO (13,2,12,1,_,12,1,11,2) 350572.1804 93.02188980281947 54 9.699686188970169E-5 350566.94541642064 NaN 4.476706032550229 NaN 0.4589727274204179 NaN 0.0 0.0 23.273694629520087 NaN 11.372220042377085 0.0 350566.40625 4.9377752090695495 1.3912897271407283 1.418276190758 1.9732330944498893 0.0 425.46913502384274 0.8284143332887641 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
44003 1 CH3CHO (8,6,3,3,_,9,5,4,3) 350808.1122 318.0348265703963 34 1.075967688918428E-5 350801.2794264813 Infinity 5.839129418307394 Infinity 0.565736741450577 Infinity 0.0 0.0 11.264418715889377 Infinity 6.784303066688616 0.0 350801.75 5.436988227894669 1.3717772790578981 2.775228977203 3.806996055110165 0.0 156.0928146251678 0.8412065022978162 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
44003 1 CH3CHO (8,6,2,3,_,9,5,5,3) 350808.1275 318.03482730468073 34 1.0759677371497151E-5 350801.2795084328 Infinity 5.852134153594954 Infinity 0.5663191176333013 Infinity 0.0 0.0 11.228477212030814 Infinity 6.769618049955546 0.0 350801.75 5.450063014554457 1.3717772790578981 2.775228977203 3.806996055110165 0.0 156.8862242548636 0.8412065022978162 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
and rows then evaluates to the following list of lists:
[['version=2']
['id', 'NumCompo', 'Species', 'QuantumNumbers', 'Frequency', 'Eup', 'Gup', 'Aij', 'FitFreq', 'DeltaFitFreq', 'Vo', 'deltaVo', 'FWHM_G', 'deltaFWHM_G', 'FWHM_L', 'deltaFWHM_L', 'Intensity', 'deltaIntensity', 'FitFlux', 'deltaFitFlux', 'Freq.IntensityMax', 'V.IntensityMax', 'FWHM', 'IntensityMax', 'Flux1stMom', 'deltaFlux1stMom', 'rms', 'deltaV', 'Cal', 'Size', 'TelescopePath', 'TelescopeName']
['None', 'None', 'None', 'None', 'MHz', 'K', 'None', 's-1', 'MHz', 'MHz', 'km/s', 'km/s', 'km/s', 'km/s', 'km/s', 'km/s', 'K', 'K', 'K.km/s', 'K.km/s', 'MHz', 'km/s', 'km/s', 'K', 'K.km/s', 'K.km/s', 'mK', 'km/s', '%', 'arcsec', 'None', 'None']
['44003', '1', 'CH3CHO', '(18,1,17,2,_,17,1,16,2)', '350362.8435', '163.4598498101699', '74', '0.0014741376966675667', '350355.5848769065', 'Infinity', '6.210933891166498', 'Infinity', '1.739817511288065', 'Infinity', '0.0', '0.0', '2.8623661141900496', 'Infinity', '5.301075803032265', '0.0', '350355.5', '6.2835599041722', '1.7199802504879502', '2.848570585251', '4.899485148752622', '0.0', '3.854414571567953E-5', '0.8289084198960507', '0.0', '0.0', '/home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/', 'alma_400m']
['44003', '1', 'CH3CHO', '(18,1,17,0,_,17,1,16,0)', '350445.7777', '163.41850853869101', '74', '0.0014742735891251069', '350437.70831029414', '0.12591692133973328', '6.903042719892636', '0.10771941048880102', '2.203766561226652', '0.2947187307742186', '0.0', '0.0', '3.482121868378891', '0.34484851265307565', '8.168010359688692', '0.0', '350437.5', '7.08124391130517', '2.11135392209597', '3.597269296646', '7.595108638308943', '0.0', '204.05560763773454', '0.8287143946045267', '0.0', '0.0', '/home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/', 'alma_400m']
['44003', '1', 'CH3CHO', '(13,2,12,1,_,12,1,11,2)', '350572.1804', '93.02188980281947', '54', '9.699686188970169E-5', '350566.94541642064', 'NaN', '4.476706032550229', 'NaN', '0.4589727274204179', 'NaN', '0.0', '0.0', '23.273694629520087', 'NaN', '11.372220042377085', '0.0', '350566.40625', '4.9377752090695495', '1.3912897271407283', '1.418276190758', '1.9732330944498893', '0.0', '425.46913502384274', '0.8284143332887641', '0.0', '0.0', '/home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/', 'alma_400m']
['44003', '1', 'CH3CHO', '(8,6,3,3,_,9,5,4,3)', '350808.1122', '318.0348265703963', '34', '1.075967688918428E-5', '350801.2794264813', 'Infinity', '5.839129418307394', 'Infinity', '0.565736741450577', 'Infinity', '0.0', '0.0', '11.264418715889377', 'Infinity', '6.784303066688616', '0.0', '350801.75', '5.436988227894669', '1.3717772790578981', '2.775228977203', '3.806996055110165', '0.0', '156.0928146251678', '0.8412065022978162', '0.0', '0.0', '/home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/', 'alma_400m']
['44003', '1', 'CH3CHO', '(8,6,2,3,_,9,5,5,3)', '350808.1275', '318.03482730468073', '34', '1.0759677371497151E-5', '350801.2795084328', 'Infinity', '5.852134153594954', 'Infinity', '0.5663191176333013', 'Infinity', '0.0', '0.0', '11.228477212030814', 'Infinity', '6.769618049955546', '0.0', '350801.75', '5.450063014554457', '1.3717772790578981', '2.775228977203', '3.806996055110165', '0.0', '156.8862242548636', '0.8412065022978162', '0.0', '0.0', '/home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/', 'alma_400m']]
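Individual values can then be accessed by indexing first the row and then the column; for example, rows[3][4] is '350362.8435', the Frequency of the first record.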
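Since the question asks for only a few parameters per line, the header row can be used to look columns up by name instead of by position. The following is a minimal sketch under stated assumptions: the helper name select_columns is hypothetical, and the sample is an excerpt of the file above trimmed to its first six columns so that the example stays short.

```python
import re

# Excerpt of the file above, trimmed to the first six columns for brevity.
sample = """version=2
id NumCompo Species QuantumNumbers Frequency Eup
None None None None MHz K
44003 1 CH3CHO (18 1 17 2 _ 17 1 16 2) 350362.8435 163.4598498101699
44003 1 CH3CHO (13 2 12 1 _ 12 1 11 2) 350572.1804 93.02188980281947"""


def select_columns(text, wanted):
    """Return one dict per record, holding only the wanted columns.

    Hypothetical helper; assumes the three-line header layout shown above.
    """
    # Same regex as in the answer: spaces inside parentheses become commas.
    s = re.sub(r' (?=[^\(\)]*\))', ',', text.strip())
    rows = [r.split() for r in s.split('\n')]
    # rows[0] is the version line, rows[1] the column names, rows[2] the units.
    header, records = rows[1], rows[3:]
    idx = [header.index(name) for name in wanted]
    return [{header[i]: rec[i] for i in idx} for rec in records]


selected = select_columns(sample, ["Species", "QuantumNumbers", "Frequency"])
print(selected)
```

All values are still strings at this point; convert them with float() where numbers are needed.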
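Answer 1 (score: 0)
It looks like you have three header lines followed by records whose fields are separated by varying amounts of whitespace and do not themselves contain spaces, except for the QuantumNumbers field, which holds nine space-separated tokens inside parentheses. You can handle this by skipping the header lines, splitting each record on whitespace, and joining the nine QuantumNumbers tokens back into a single field.
Like this: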
import re
text = """version=2
id NumCompo Species QuantumNumbers Frequency Eup Gup Aij FitFreq DeltaFitFreq Vo deltaVo FWHM_G deltaFWHM_G FWHM_L deltaFWHM_L Intensity deltaIntensity FitFlux deltaFitFlux Freq.IntensityMax V.IntensityMax FWHM IntensityMax Flux1stMom deltaFlux1stMom rms deltaV Cal Size TelescopePath TelescopeName
None None None None MHz K None s-1 MHz MHz km/s km/s km/s km/s km/s km/s K K K.km/s K.km/s MHz km/s km/s K K.km/s K.km/s mK km/s % arcsec None None
44003 1 CH3CHO (18 1 17 2 _ 17 1 16 2) 350362.8435 163.4598498101699 74 0.0014741376966675667 350355.5848769065 Infinity 6.210933891166498 Infinity 1.739817511288065 Infinity 0.0 0.0 2.8623661141900496 Infinity 5.301075803032265 0.0 350355.5 6.2835599041722 1.7199802504879502 2.848570585251 4.899485148752622 0.0 3.854414571567953E-5 0.8289084198960507 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
44003 1 CH3CHO (18 1 17 0 _ 17 1 16 0) 350445.7777 163.41850853869101 74 0.0014742735891251069 350437.70831029414 0.12591692133973328 6.903042719892636 0.10771941048880102 2.203766561226652 0.2947187307742186 0.0 0.0 3.482121868378891 0.34484851265307565 8.168010359688692 0.0 350437.5 7.08124391130517 2.11135392209597 3.597269296646 7.595108638308943 0.0 204.05560763773454 0.8287143946045267 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
44003 1 CH3CHO (13 2 12 1 _ 12 1 11 2) 350572.1804 93.02188980281947 54 9.699686188970169E-5 350566.94541642064 NaN 4.476706032550229 NaN 0.4589727274204179 NaN 0.0 0.0 23.273694629520087 NaN 11.372220042377085 0.0 350566.40625 4.9377752090695495 1.3912897271407283 1.418276190758 1.9732330944498893 0.0 425.46913502384274 0.8284143332887641 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
44003 1 CH3CHO (8 6 3 3 _ 9 5 4 3) 350808.1122 318.0348265703963 34 1.075967688918428E-5 350801.2794264813 Infinity 5.839129418307394 Infinity 0.565736741450577 Infinity 0.0 0.0 11.264418715889377 Infinity 6.784303066688616 0.0 350801.75 5.436988227894669 1.3717772790578981 2.775228977203 3.806996055110165 0.0 156.0928146251678 0.8412065022978162 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m
44003 1 CH3CHO (8 6 2 3 _ 9 5 5 3) 350808.1275 318.03482730468073 34 1.0759677371497151E-5 350801.2795084328 Infinity 5.852134153594954 Infinity 0.5663191176333013 Infinity 0.0 0.0 11.228477212030814 Infinity 6.769618049955546 0.0 350801.75 5.450063014554457 1.3717772790578981 2.775228977203 3.806996055110165 0.0 156.8862242548636 0.8412065022978162 0.0 0.0 /home/dipen/Downloads/cassis3.9-160426-build6032/delivery/telescope/ alma_400m"""
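records = text
lines = records.split("\n")
fields = []
WHITESPACE = re.compile(r"\s+")
# Skip the three header lines (version, column names, units).
for line in lines[3:]:
    current = WHITESPACE.split(line)
    # Tokens 3..11 are the nine pieces of QuantumNumbers; merge them into one field.
    current[3:12] = [" ".join(current[3:12])]
    fields.append(current)
print(fields)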
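The fields produced above are all strings. As a possible follow-up, the sketch below pairs each field with its column name and converts whatever parses as a number; Python's float() accepts the "Infinity" and "NaN" spellings that appear in this file. The helper name to_number is hypothetical, and the record is the third data line of the file, cut off after its first ten fields to keep the example short.

```python
import math
import re

# First ten column names and the matching part of the third record above.
header = ("id NumCompo Species QuantumNumbers Frequency Eup Gup Aij "
          "FitFreq DeltaFitFreq").split()
record = ("44003 1 CH3CHO (13 2 12 1 _ 12 1 11 2) 350572.1804 "
          "93.02188980281947 54 9.699686188970169E-5 350566.94541642064 NaN")


def to_number(token):
    """Return token as a float if it parses as one, else leave it a string."""
    try:
        return float(token)
    except ValueError:
        return token


tokens = re.split(r"\s+", record.strip())
# Merge the nine QuantumNumbers tokens back into one field, as above.
tokens[3:12] = [" ".join(tokens[3:12])]
row = dict(zip(header, (to_number(t) for t in tokens)))

print(row["Frequency"], row["QuantumNumbers"], math.isnan(row["DeltaFitFreq"]))
```

Note that integer-like columns such as id also become floats here; use int() for those if the distinction matters.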