Python: append a date from a separate file twice to the end of an array

Asked: 2017-07-28 14:25:37

Tags: python arrays multiple-files

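I need help with a specific problem. I have .composite files (e.g. RR0063_0011.composite) whose second column (intensity) is read into an array. Before transposing and saving the array, I need to append the date (Modified Julian Date), taken from the second column of a separate file, twice to the end of each row. Example input files: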

Data (.composite) file:

Bin#.....Intensity

1. -0.234987
2. 0.87734
...
512. -0.65523

Modified Julian Date file:

File to grab MJD from...... MJD

RR0063_0011.profs   55105.07946
RR0023_0061.profs   53495.367377
RR0022_0041.profs   53492.307631

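Here is the code that reads the data into an array and generates the mjd.txt file. All of this works so far; I just need to add the MJD twice to the end of the corresponding .composite row. I know very little Python, but this is my code so far.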

#!/usr/bin/python
import sys
import glob
import numpy as np
import os

psrname = sys.argv[1]
file_list = glob.glob('*.composite')

cols = [1] 
data = []
for f in file_list:
    # Split the filename from the extension to use later    
    filename = os.path.splitext('{0}'.format(f))
    data.append(np.loadtxt(f, usecols=cols))
    print data

# Run 'vap' (a PSRCHIVE command) to grab the MJD from the .profs file for each observation and write out to a file called 'mjd.txt'
os.system('vap -nc mjd ../{0}/{0}.profs >> mjd.txt' .format(filename[0]))

# Put the MJDs only (from 'mjd.txt') in an array called mjd_array
mjd_array = np.genfromtxt('mjd.txt', dtype=[('filename_2','S40'),('mjd','f8')])

# Check if working
print mjd_array['mjd'][7]

arr = np.vstack(data)

transposed_arr = np.transpose(arr)
print transposed_arr

fout = np.savetxt(psrname + '.total', transposed_arr, delimiter='   ')

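The MJDs are not in the same order as the .composite files, and in the end I need to sort all columns by MJD before saving.

Thanks for your help!

Desired output: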

Intensity   .....   Intensity   MJD   MJD

-0.234987   2. 0.87734   ...   -0.65523   55105.07946   55105.07946

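1 Answer:

Answer 0 (score: 0)

Assuming the extra 2. in your example output is not wanted (probably a copy-and-paste error from the example input), you can first read the dates from the date file and use them as a kind of look-up table: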

import os
import numpy as np


# function to read dates from generated mjd file
# and create look-up table (is a list of lists)
def read_mjd_file():
    with open('mjd.txt') as f:
        lines = f.read().splitlines()
    lines = [line.split() for line in lines]        
    return lines


# function for date look-up
# filename is first list element, date is second
def get_date(base_name):
    for l in lines:
        if l[0].startswith(base_name):
            return l[1]


# function to read data from data file
def extract_data(file_name):
    with open(file_name) as f:
        data = np.loadtxt(f, usecols=[1])
    return data


# generate mjd file at first
# os.system(...)


# generate look-up table from mjd file
lines = read_mjd_file()


# walk through all files given in directory and filter for desired file type
for file_name in os.listdir():
    if file_name.endswith('.composite'):
        base_name = file_name.split('.')[0]
        date = get_date(base_name)
        data = extract_data(file_name)
        out = np.append(data, 2*[date])


print(out)

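You can adapt this approach to your specific needs; it is only a proof of concept meant to give you some hints. Personally, I prefer os.listdir() to glob.glob(). Also, I don't think you need numpy for this fairly simple task; Python's standard csv module should do the job as well. However, numpy's functions are much more comfortable to use. So if you need numpy for other tasks, you can keep it. If not, rewriting the snippet with the csv module should not be a big deal.

mjd.txt looks like: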
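To illustrate the point that numpy is not strictly required, here is a minimal sketch of the same look-up idea in plain Python. It uses str.split() rather than the csv module, because the example files are separated by runs of whitespace. The function names parse_mjd_table and parse_intensities are hypothetical, and the file contents are passed in as strings so the example is self-contained; in practice you would read them from mjd.txt and the .composite file.

```python
def parse_mjd_table(text):
    """Map each observation base name (e.g. 'RR0063_0011') to its MJD as a float."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        base_name = parts[0].split('.')[0]  # drop the '.profs' extension
        table[base_name] = float(parts[1])
    return table


def parse_intensities(text):
    """Return the second column (intensity) of a .composite file as floats."""
    values = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        values.append(float(parts[1]))
    return values


# Example contents, copied from the sample files in this post
mjd_text = """RR0063_0011.profs   55105.07946
RR0023_0061.profs   53495.367377
RR0022_0041.profs   53492.307631
"""
composite_text = """1. -0.234987
2. 0.87734
512. -0.65523
"""

table = parse_mjd_table(mjd_text)
# Intensities of RR0023_0061.composite followed by its MJD, twice
row = parse_intensities(composite_text) + 2 * [table['RR0023_0061']]
print(row)
```

A dict keyed by base name avoids scanning the whole list for every file, and converting to float early keeps the row numeric instead of a list of strings.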

RR0063_0011.profs   55105.07946
RR0023_0061.profs   53495.367377
RR0022_0041.profs   53492.307631

RR0023_0061.composite looks like:

1. -0.234987
2. 0.87734
512. -0.65523

The output (variable out) is an np.array:

['-0.234987' '0.87734' '-0.65523' '53495.367377' '53495.367377']
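The question also asks for the result to be sorted by MJD before saving, which the proof of concept does not cover. The following is a rough sketch of one way to do that with numpy, assuming each file contributes one row of intensities and that the MJDs have already been looked up as floats. The function name build_sorted_table and the second row of sample intensities are made up for illustration.

```python
import numpy as np


def build_sorted_table(intensity_rows, mjds):
    """Append each MJD twice to its intensity row, then order the rows by MJD."""
    rows = [
        np.append(np.asarray(row, dtype=float), [mjd, mjd])
        for row, mjd in zip(intensity_rows, mjds)
    ]
    table = np.vstack(rows)
    order = np.argsort(table[:, -1])  # the last column holds the MJD
    return table[order]


# Hypothetical data: one row of intensities per .composite file
intensity_rows = [
    [-0.234987, 0.87734, -0.65523],  # sample values from this post
    [0.1, 0.2, 0.3],                 # made-up values for a second file
]
mjds = [55105.07946, 53495.367377]

total = build_sorted_table(intensity_rows, mjds)
print(total)

# To save, as in the question (transpose first if one column per file is wanted):
# np.savetxt('example.total', total, delimiter='   ')
```

Keeping the values as floats rather than strings is what makes np.argsort order the rows numerically by date.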