Getting specific elements from file headers?

Time: 2015-07-30 14:48:48

Tags: python python-2.7

I have data in this format:

#NUMBER OF AGES=37 MAGS= 5
#----------------------------------------------------       
#MIX-LEN  Y      Z          Zeff        [Fe/H] [a/Fe]
# 1.9380  0.3300 3.1125E-04 1.5843E-04  -1.99   0.40
#----------------------------------------------------       
#**PHOTOMETRIC SYSTEM**: SDSS (AB)
#----------------------------------------------------       
#AGE= 1.000 EEPS=276
#EEP   M/Mo    LogTeff  LogG   LogL/Lo sdss_u  sdss_g  sdss_r  sdss_i  sdss_z  
   3  0.105051  3.5416  5.3073 -2.7281 17.2434 13.9245 12.2511 11.5059 11.0920
   4  0.118159  3.5519  5.2644 -2.5931 16.5471 13.4392 11.8286 11.1502 10.7665
   5  0.133632  3.5623  5.2211 -2.4547 15.8825 12.9522 11.4091 10.7861 10.4301
   6  0.156724  3.5737  5.1745 -2.2933 15.1633 12.4008 10.9372 10.3638 10.0348
   7  0.194206  3.5861  5.1259 -2.1020 14.3673 11.7681 10.3975  9.8690  9.5654

#AGE= 1.250 EEPS=275
#EEP   M/Mo    LogTeff  LogG   LogL/Lo sdss_u  sdss_g  sdss_r  sdss_i  sdss_z  
   4  0.105349  3.5419  5.3062 -2.7248 17.2255 13.9125 12.2406 11.4972 11.0840
   5  0.118453  3.5521  5.2635 -2.5901 16.5322 13.4285 11.8194 11.1424 10.7593
   6  0.133982  3.5625  5.2202 -2.4518 15.8684 12.9418 11.4002 10.7783 10.4228
   7  0.157170  3.5739  5.1736 -2.2903 15.1503 12.3907 10.9286 10.3560 10.0274
   8  0.193935  3.5860  5.1253 -2.1021 14.3683 11.7688 10.3980  9.8694  9.5657
   9  0.233067  3.5953  5.0847 -1.9445 13.7342 11.2608  9.9652  9.4664  9.1794
  10  0.253422  3.5994  5.0635 -1.8706 13.4408 11.0263  9.7653  9.2787  8.9986
  11  0.263033  3.6012  5.0556 -1.8395 13.3177 10.9277  9.6812  9.1997  8.9225
  12  0.269548  3.6022  5.0484 -1.8175 13.2345 10.8597  9.6223  9.1439  8.8684

#AGE= 1.500 EEPS=274
#EEP   M/Mo    LogTeff  LogG   LogL/Lo sdss_u  sdss_g  sdss_r  sdss_i  sdss_z  
   5  0.107932  3.5440  5.2974 -2.6971 17.0780 13.8121 12.1528 11.4242 11.0175
   6  0.121259  3.5542  5.2549 -2.5632 16.3994 13.3328 11.7366 11.0714 10.6940
   7  0.137720  3.5647  5.2112 -2.4220 15.7312 12.8387 11.3119 10.7002 10.3502
   8  0.163277  3.5763  5.1639 -2.2544 14.9976 12.2706 10.8261 10.2628  9.9394
   9  0.199555  3.5876  5.1176 -2.0757 14.2622 11.6834 10.3250  9.8017  9.5010

I have figured out how to read the columns (in theory) using this:

sdss_g = []
sdss_r = []

with open ("/Users/Wilson/research/isochrones/SDSSugriz/fehm20afep4y33.SDSSugriz") as f:
   for _ in xrange(9):
        next(f)

   for line in f: 
        cols = line.split()
        g = (float(cols[6]))
        r = (float(cols[7]))
        print 'g'
        print g
        print 'r'
        print r
        sdss_g.append(g)
        sdss_r.append(r)

But this does not work; it gives me this error:

Traceback (most recent call last):
  File "isochronereader.py", line 18, in <module>
    g = (float(cols[6]))
IndexError: list index out of range

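I am fairly sure this happens because I am not accounting for the blank lines and the new headers, which are plain strings, but "list index out of range" does not seem like the right indicator for that problem.

Either way, I do not know how to account for the headers and the blank lines between the data blocks, and I would like to learn how. I would also like to read the AGE from each header. Because of the format of this text file I cannot use genfromtxt, and I cannot figure out what to do instead, especially since the number of rows before the next header is not uniform, so I cannot simply read the header after a fixed number of lines.

In the end, I want to read in sdss_g and sdss_r and know the age associated with each sdss_g and sdss_r element. Is there a simple way to do this?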

2 Answers:

Answer 0: (score: 2)

You can do it like this:

sdss_g = []
sdss_r = []

with open("/Users/Wilson/research/isochrones/SDSSugriz/fehm20afep4y33.SDSSugriz") as f:
    for line in f:
        # Skip blank lines and header lines (those starting with "#")
        if not line.isspace() and not line.startswith("#"):
            print line
            cols = line.split()
            # Columns 6 and 7 hold sdss_g and sdss_r
            g = float(cols[6])
            r = float(cols[7])
            print 'g'
            print g
            print 'r'
            print r
            sdss_g.append(g)
            sdss_r.append(r)
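
The IndexError in the question most likely comes from the first blank line after a data block: `"\n".split()` returns an empty list, so `cols[6]` does not exist. The repeated `#AGE=` and `#EEP` header lines would fail next, since they cannot be converted to floats. The filter above avoids both cases, but it discards the age. A minimal sketch of one way to keep it, assuming every `#AGE=` header has the form shown in the question (the function name `read_isochrone` and the inline `sample` are made up for illustration):

```python
def read_isochrone(lines):
    """Return a list of (age, sdss_g, sdss_r) tuples from an iterable of lines."""
    rows = []
    age = None  # age of the block currently being read
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank separator between blocks
        if line.startswith("#AGE="):
            # Header looks like "#AGE= 1.250 EEPS=275"; take the first token after "#AGE="
            age = float(line[len("#AGE="):].split()[0])
            continue
        if line.startswith("#"):
            continue  # any other header line
        cols = line.split()
        rows.append((age, float(cols[6]), float(cols[7])))
    return rows


# Hypothetical two-block sample copied from the data in the question
sample = """\
#AGE= 1.000 EEPS=276
#EEP   M/Mo    LogTeff  LogG   LogL/Lo sdss_u  sdss_g  sdss_r  sdss_i  sdss_z
   3  0.105051  3.5416  5.3073 -2.7281 17.2434 13.9245 12.2511 11.5059 11.0920

#AGE= 1.250 EEPS=275
#EEP   M/Mo    LogTeff  LogG   LogL/Lo sdss_u  sdss_g  sdss_r  sdss_i  sdss_z
   4  0.105349  3.5419  5.3062 -2.7248 17.2255 13.9125 12.2406 11.4972 11.0840
"""
rows = read_isochrone(sample.splitlines())
```

Because the function only iterates over lines, an open file object can be passed in directly in place of `sample.splitlines()`. The sketch avoids `print`, so it should behave the same under Python 2.7 and Python 3.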

Answer 1: (score: 2)

If you split the data into blocks first, you can use the CSV library:

import csv, re, StringIO

with open("/Users/Wilson/research/isochrones/SDSSugriz/fehm20afep4y33.SDSSugriz", "r") as f_input:
    data = f_input.read()
    lblocks = re.findall(r"#AGE=\s+?(.*?)\s+(?:.*?[\r\n]|\Z){2}(.*?(?=#AGE|\Z|^$))", data, re.S + re.M)

    for age, block in lblocks:
        csv_reader = csv.reader(StringIO.StringIO(block), delimiter=" ", skipinitialspace=True)
        print "Age:", age

        for cols in csv_reader:
            print "  sdss_g %s, sdss_r %s" % (cols[6], cols[7])

This gives the following output:

Age: 1.000
  sdss_g 13.9245, sdss_r 12.2511
  sdss_g 13.4392, sdss_r 11.8286
  sdss_g 12.9522, sdss_r 11.4091
  sdss_g 12.4008, sdss_r 10.9372
  sdss_g 11.7681, sdss_r 10.3975
Age: 1.250
  sdss_g 13.9125, sdss_r 12.2406
  sdss_g 13.4285, sdss_r 11.8194
  sdss_g 12.9418, sdss_r 11.4002
  sdss_g 12.3907, sdss_r 10.9286
  sdss_g 11.7688, sdss_r 10.3980
  sdss_g 11.2608, sdss_r 9.9652
  sdss_g 11.0263, sdss_r 9.7653
  sdss_g 10.9277, sdss_r 9.6812
  sdss_g 10.8597, sdss_r 9.6223
Age: 1.500
  sdss_g 13.8121, sdss_r 12.1528
  sdss_g 13.3328, sdss_r 11.7366
  sdss_g 12.8387, sdss_r 11.3119
  sdss_g 12.2706, sdss_r 10.8261
  sdss_g 11.6834, sdss_r 10.3250

Tested using Python 2.7
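
Note that the `StringIO` module exists only in Python 2; in Python 3 the equivalent class is `io.StringIO`. As a rough alternative that does not depend on either, the block idea can be sketched with `re.split` alone, returning a dictionary keyed by age instead of printing. This assumes blocks are separated by at least one blank line and that each block contains one `#AGE=` header; the name `blocks_by_age` and the inline `text` sample are illustrative only.

```python
import re


def blocks_by_age(text):
    """Map each age (float) to a list of (sdss_g, sdss_r) pairs."""
    result = {}
    # Blocks are separated by one or more blank (or whitespace-only) lines
    for block in re.split(r"\n\s*\n", text):
        match = re.search(r"#AGE=\s*([0-9.]+)", block)
        if match is None:
            continue  # nothing to do for a block without an AGE header
        age = float(match.group(1))
        pairs = result.setdefault(age, [])
        for line in block.splitlines():
            # Keep only data rows: non-blank lines that are not "#" headers
            if line.strip() and not line.lstrip().startswith("#"):
                cols = line.split()
                pairs.append((float(cols[6]), float(cols[7])))
    return result


# Hypothetical sample: file-level header, then two age blocks from the question
text = (
    "#NUMBER OF AGES=37 MAGS= 5\n"
    "#AGE= 1.000 EEPS=276\n"
    "#EEP   M/Mo    LogTeff  LogG   LogL/Lo sdss_u  sdss_g  sdss_r  sdss_i  sdss_z\n"
    "   3  0.105051  3.5416  5.3073 -2.7281 17.2434 13.9245 12.2511 11.5059 11.0920\n"
    "   4  0.118159  3.5519  5.2644 -2.5931 16.5471 13.4392 11.8286 11.1502 10.7665\n"
    "\n"
    "#AGE= 1.250 EEPS=275\n"
    "#EEP   M/Mo    LogTeff  LogG   LogL/Lo sdss_u  sdss_g  sdss_r  sdss_i  sdss_z\n"
    "   4  0.105349  3.5419  5.3062 -2.7248 17.2255 13.9125 12.2406 11.4972 11.0840\n"
)
by_age = blocks_by_age(text)
```

With the real file, `text` would be the result of `f.read()`. Keying by age makes it easy to look up all `(sdss_g, sdss_r)` pairs for one isochrone, for example `by_age[1.25]`.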