I have a Fortran-formatted text file (here are its first 3 lines):
00033+3251 A B C? 6.96 5.480" 358 9.12 F0V 0.00 2.28s 1.00: 2MASS, dJ=1.3
00033+3251 Aa Ab Aab S1,E 0.62 0.273m 0 9.28 F0V 11.28 K2 1.68* 0.32* SB 1469
00033+3251 Aab Ac A E* 4.26 0.076" 0 9.12 F0V 0.00 2.00s 0.28* 2008MNRAS.383.1506
and a description of the file format:
--------------------------------------------------------------------------------
Bytes  Format Units   Label     Explanations
--------------------------------------------------------------------------------
1- 10  A10    ---     WDS       WDS(J2000)
12- 14 A3     ---     Primary   Designation of the primary
16- 18 A3     ---     Secondary Designation of the secondary component
20- 22 A3     ---     Parent    Designation of the parent (1)
24- 29 A6     ---     Type      Observing technique/status (2)
31- 35 F5.2   d       logP      ? Logarithm (10) of period in days
37- 44 F8.3   ---     Sep       Separation or axis
45     A1     ---     x_Sep     ['"m] Units of sep. (',",m)
47- 49 I3     deg     PA        Position angle
51- 55 F5.2   mag     Vmag1     V-magnitude of the primary
57- 61 A5     ---     SP1       Spectral type of the primary
63- 67 F5.2   mag     Vmag2     V-magnitude of the secondary
69- 73 A5     ---     SP2       Spectral type of the secondary
75- 79 F5.2   solMass Mass1     Mass of the primary
80     A1     ---     MCode1    Mass estimation code for primary (3)
82- 86 F5.2   solMass Mass2     Mass of the secondary
87     A1     ---     MCode2    Mass estimation code for secondary (3)
89-108 A20    ---     Rem       Remark
How can I read my file in Python? I found the read_fwf function in the pandas library.
import pandas as pd
filename = 'systems'
columns = ((0,10),(11,14),(15,18),(19,22),(23,29),(30,35),(36,44),(45,45),(46,49),(50,55),(56,61),(62,67),(68,73),(74,79),(80,80),(81,86),(87,87),(88,108))
data = pd.read_fwf(filename, colspecs = columns, header=None)
Is this the only feasible and efficient way? I would like to be able to do this without pandas. Do you have any suggestions?
Answer 0 (score: 3)
columns = ((0,10),(11,14),(15,18),(19,22),(23,29),(30,35),
(36,44),(44,45),(46,49),(50,55),(56,61),(62,67),
(68,73),(74,79),(79,80),(81,86),(86,87),(88,108))
with open('systems') as f:
    string = f.readline()
dataline = [string[start:end] for start, end in columns]
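Note that the column indices are (startbyte-1, endbyte), so a single-character field is, for example, (44,45).
This gives you a list of strings. You will probably want to convert them to floats, integers, and so on. There are many questions on that topic.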
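The slicing and conversion steps described above can be sketched together as follows. This is a minimal sketch, not part of the original answer: the FIELDS table is copied from the format description in the question, while parse_line, make_line, the choice of None for blank numeric fields, and the exact column placement of the sample record are assumptions made for illustration.

```python
# Byte ranges from the format description: (first_byte, last_byte, label, type).
# The converter assigned to each field is an illustrative choice.
FIELDS = [
    (1, 10, "WDS", str), (12, 14, "Primary", str), (16, 18, "Secondary", str),
    (20, 22, "Parent", str), (24, 29, "Type", str), (31, 35, "logP", float),
    (37, 44, "Sep", float), (45, 45, "x_Sep", str), (47, 49, "PA", int),
    (51, 55, "Vmag1", float), (57, 61, "SP1", str), (63, 67, "Vmag2", float),
    (69, 73, "SP2", str), (75, 79, "Mass1", float), (80, 80, "MCode1", str),
    (82, 86, "Mass2", float), (87, 87, "MCode2", str), (89, 108, "Rem", str),
]


def parse_line(line):
    """Slice one fixed-width record and convert each field to its type."""
    record = {}
    for first, last, label, convert in FIELDS:
        # 1-based inclusive bytes (first, last) become the slice [first-1:last].
        text = line[first - 1:last].strip()
        if convert is str:
            record[label] = text
        else:
            # Assumption: a blank numeric field is stored as None.
            record[label] = convert(text) if text else None
    return record


def make_line(values):
    """Build a 108-character record from {label: text} (hypothetical helper,
    used only to produce a correctly spaced sample for this sketch)."""
    chars = [" "] * 108
    for first, last, label, _ in FIELDS:
        width = last - first + 1
        chars[first - 1:last] = values.get(label, "").ljust(width)[:width]
    return "".join(chars)


# Values taken from the first sample line of the question; their placement
# in the columns is reconstructed here, not copied from the real file.
sample = make_line({
    "WDS": "00033+3251", "Primary": "A", "Secondary": "B", "Parent": "C?",
    "logP": "6.96", "Sep": "5.480", "x_Sep": '"', "PA": "358",
    "Vmag1": "9.12", "SP1": "F0V", "Vmag2": "0.00", "Mass1": "2.28",
    "MCode1": "s", "Mass2": "1.00", "MCode2": ":", "Rem": "2MASS, dJ=1.3",
})
rec = parse_line(sample)
print(rec["logP"], rec["Sep"], rec["x_Sep"], rec["PA"])
```

Keeping the 1-based byte ranges in the table and converting them inside parse_line avoids retyping the off-by-one arithmetic for every field. For a real file, parse_line would be called on each line read from the file instead of on the synthetic sample.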
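Answer 1 (score: 2)
This type of file can be read with astropy tables. The header you show looks very much like a CDS-format ASCII table, for which astropy implements a dedicated reader: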
http://astropy.readthedocs.org/en/latest/api/astropy.io.ascii.Cds.html#astropy.io.ascii.Cds
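As a rough sketch of how that reader might be called: the function name read_cds_table and the file names in the comments are hypothetical, and the sketch assumes that the format description is stored in a separate ReadMe file containing a byte-by-byte description section for the data file, which is what the CDS reader looks for.

```python
def read_cds_table(data_path, readme_path):
    """Read a CDS-style table using the byte-by-byte description in a ReadMe."""
    # Imported here so that defining the function does not itself require astropy.
    from astropy.io import ascii

    # format="cds" selects astropy's CDS reader; readme= points at the file
    # holding the "Bytes Format Units Label Explanations" description.
    return ascii.read(data_path, format="cds", readme=readme_path)


# Hypothetical file names; substitute the actual data file and its ReadMe.
# table = read_cds_table("systems", "ReadMe")
# print(table["WDS", "logP", "Sep"])
```

If this works for the file, the result is an astropy Table whose column names, units, and types come from the description, so no manual slicing or conversion is needed.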
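Answer 2 (score: 1)
There is the FortranRecordReader class (from the fortranformat module), but it copes poorly with the asterisks, comments, and similar content found in modern Fortran files. Still, for a well-behaved file it is useful in combination with namedtuple. For example: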
from fortranformat import FortranRecordReader
fline=FortranRecordReader('(a1,i3,i5,i5,i5,1x,a3,a4,1x,f13.5,f11.5,f11.3,f9.3,1x,a2,f11.3,f9.3,1x,i3,1x,f12.5,f11.5)')
from collections import namedtuple
record=namedtuple('nucleo','cc NZ N Z A el o massexcess uncmassex binding uncbind B beta uncbeta am_int am_float uncatmass')
with open('AME2012.mas12.ff', 'r') as f:
    for line in f:
        nucl = record._make(fline.read(line))
You can also try the "parse" module, or write your