How do I extract data from a text file and write it to a new output file?

Time: 2013-04-10 18:34:44

Tags: python file-io python-2.7

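Suppose the following is the data in ctf_output.txt. I want to extract the header lines (Rod #, Surface Temperature and Centerline Temperature) and then, for each rod, only the maximum temperature in each column. Note that I have placed one specific, clearly higher critical value in each column, with no duplicates.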

Rod 1  
Surface Temperature Centerline Temperature  
500         510    
501         511  
502         512  
503         513  
504         525  
505         515  
535         516  
507         517  
508         518  
509         519  
510         520  
Rod 2  
Surface Temperature   Centerline Temperature  
500               510  
501           511  
502           512  
503           513  
504           555  
505           515  
540           516  
507               517  
508           518  
509           519  
510           520  
Rod 3
Surface Temperature   Centerline Temperature  
500           510  
501           511  
502           512  
503           513  
567           514  
505           515  
506           559  
507           517  
508           518  
509           519  
510           520  

How can I do this with Python? I need a Python script that extracts the data and populates a new output file in the following format:

Rod 1  
Surface Temperature Centerline Temperature  
535         525  

Rod 2  
Surface Temperature   Centerline Temperature  
540           555  

Rod 3  
Surface Temperature   Centerline Temperature  
567           559  

1 answer:

Answer 0 (score: 3)

Read the file line by line, keep track of the maximum values, and write them out when the next section starts:

# placeholder output file name; the original snippet left outputfilename undefined
outputfilename = 'ctf_maxima.txt'

with open('ctf_output.txt', 'r') as temps, open(outputfilename, 'w') as output:
    # None means "no section seen yet"
    surface_max = centerline_max = None
    for line in temps:
        if line.startswith('Rod'):
            # start of new section
            if surface_max is not None or centerline_max is not None:
                # write maximum for previous section
                output.write('{}\t\t\t{}\n\n'.format(surface_max, centerline_max))
            # write out this line and the next (the column header) to the output file
            output.write(line)
            output.write(next(temps, ''))
            # reset maxima
            surface_max = centerline_max = 0
        elif line.strip():
            # temperature line; read temperatures and track maxima
            surface, centerline = [int(t) for t in line.split()]
            if surface > surface_max:
                surface_max = surface
            if centerline > centerline_max:
                centerline_max = centerline
    if surface_max is not None or centerline_max is not None:
        # write out last maxima (same test as used between sections)
        output.write('{}\t\t\t{}\n'.format(surface_max, centerline_max))

The output separates the two values with 3 tabs, just like your input.

For your sample input, it writes:

Rod 1  
Surface Temperature Centerline Temperature  
535         525

Rod 2  
Surface Temperature   Centerline Temperature  
540         555

Rod 3
Surface Temperature   Centerline Temperature  
567         559
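
If you would rather separate the parsing from the writing, the same line-by-line idea can be sketched as two small functions. This is only an illustration, not part of the answer above: the names max_per_rod and format_report are made up for this example, and the sketch assumes every section starts with a line beginning with "Rod", followed by one header line and then rows of exactly two integers.

```python
def max_per_rod(lines):
    """Return a list of (rod_title, column_header, surface_max, centerline_max).

    Hypothetical helper: assumes each section is a 'Rod ...' line, one header
    line, then rows holding exactly two integers.
    """
    results = []
    lines = iter(lines)
    current = None  # [title, header, surface_max, centerline_max]
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        if text.startswith('Rod'):
            # new section; the line right after the title is the column header
            header = next(lines, '').strip()
            current = [text, header, None, None]
            results.append(current)
        elif current is not None:
            surface, centerline = [int(t) for t in text.split()]
            if current[2] is None or surface > current[2]:
                current[2] = surface
            if current[3] is None or centerline > current[3]:
                current[3] = centerline
    return [tuple(section) for section in results]


def format_report(results):
    """Render one block per rod, with blocks separated by a blank line."""
    blocks = []
    for title, header, surface_max, centerline_max in results:
        blocks.append('{}\n{}\n{}\t\t\t{}\n'.format(
            title, header, surface_max, centerline_max))
    return '\n'.join(blocks)


# shortened sample in the same layout as ctf_output.txt
sample = [
    'Rod 1  \n',
    'Surface Temperature Centerline Temperature  \n',
    '500         510    \n',
    '535         516  \n',
    '504         525  \n',
    'Rod 2  \n',
    'Surface Temperature   Centerline Temperature  \n',
    '540           516  \n',
    '504           555  \n',
]
results = max_per_rod(sample)
report = format_report(results)
```

Because max_per_rod accepts any iterable of lines, you can pass it an open file object for ctf_output.txt instead of the sample list, and then write the string returned by format_report to your output file. Unlike the script above, this version strips trailing whitespace from the title and header lines.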