Concatenating files in Python 3

Time: 2018-05-21 21:32:14

Tags: python python-3.x concatenation

Suppose I have two files, file1.txt and file2.txt.

file1.txt looks like this:

TITLE   MEARA Repeatv2 Run2 
DATA TYPE       
ORIGIN  JASCO   
OWNER       
DATE    18/03/08    
TIME    22:07:45    
SPECTROMETER/DATA SYSTEM    JASCO Corp., J-715, Rev. 1.00   
RESOLUTION      
DELTAX  -0.1    
XUNITS  NANOMETERS  
YUNITS  CD[mdeg]    
    HT[V]   
FIRSTX  260 
LASTX   200 
NPOINTS 601 
FIRSTY  -4.70495    
MAXY    -4.70277    
MINY    -41.82113   
XYDATA      
260.0   -4.70495    443.669
259.9   -4.70277    443.672
259.8   -4.70929    443.674
259.7   -4.72508    443.681
259.6   -4.72720    443.69

file2.txt looks like this:

TITLE   MEARA Repeatv2 Run2 
DATA TYPE       
ORIGIN  JASCO   
OWNER       
DATE    18/03/08    
TIME    22:30:34    
SPECTROMETER/DATA SYSTEM    JASCO Corp., J-715, Rev. 1.00   
RESOLUTION      
DELTAX  -0.1    
XUNITS  NANOMETERS  
YUNITS  CD[mdeg]    
    HT[V]   
FIRSTX  260 
LASTX   200 
NPOINTS 601 
FIRSTY  -4.76564    
MAXY    -3.51295    
MINY    -41.95971   
XYDATA      
260 -4.76564    443.152
259.9   -4.77382    443.155
259.8   -4.78663    443.156
259.7   -4.8017 443.162
259.6   -4.83604    443.174

I wrote the following Python script to concatenate the two files.

def catFiles(names, outName):
    # Write the contents of each input file to outName, one after another.
    with open(outName, 'w') as outfile:
        for fname in names:
            with open(fname) as infile:
                outfile.write(infile.read())

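Although this script does concatenate the two files, it stacks them vertically, one file after the other. I would like to know how to modify or rewrite it so that the files are placed side by side, giving the following output: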

TITLE   MEARA Repeatv2 Run2     TITLE   MEARA Repeatv2 Run2 
DATA TYPE           DATA TYPE       
ORIGIN  JASCO       ORIGIN  JASCO   
OWNER           OWNER       
DATE    18/03/08        DATE    18/03/08    
TIME    22:07:45        TIME    22:30:34    
SPECTROMETER/DATA SYSTEM    JASCO Corp., J-715, Rev. 1.00       SPECTROMETER/DATA SYSTEM    JASCO Corp., J-715, Rev. 1.00   
RESOLUTION          RESOLUTION      
DELTAX  -0.1        DELTAX  -0.1    
XUNITS  NANOMETERS      XUNITS  NANOMETERS  
YUNITS  CD[mdeg]        YUNITS  CD[mdeg]    
    HT[V]           HT[V]   
FIRSTX  260     FIRSTX  260 
LASTX   200     LASTX   200 
NPOINTS 601     NPOINTS 601 
FIRSTY  -4.70495        FIRSTY  -4.76564    
MAXY    -4.70277        MAXY    -3.51295    
MINY    -41.82113       MINY    -41.95971   
XYDATA          XYDATA      
260.0   -4.70495    443.669 260.0   -4.76564    443.152
259.9   -4.70277    443.672 259.9   -4.77382    443.155
259.8   -4.70929    443.674 259.8   -4.78663    443.156
259.7   -4.72508    443.681 259.7   -4.80170    443.162
259.6   -4.72720    443.690 259.6   -4.83604    443.174

2 answers:

Answer 0 (score: 3)

from itertools import zip_longest

with open('file1.txt') as f1, open('file2.txt') as f2, open('out.txt', 'w') as f:
    for left, right in zip_longest(f1, f2, fillvalue='\n'):
        f.write(left.rstrip('\n') + right)

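Answer 1 (score: 1)

A text file does not really have two dimensions (width and height), even though it looks that way when you view it in a text editor. It has only one dimension.

For example, this file: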

first line
second line
third line

actually contains a single string with two newline (\n) characters:

'first line\nsecond line\nthird line'

Now, let's merge it with another file that contains:

blue
cheese

(or: 'blue\ncheese')

The normal way, which you call vertical, is simply to add the two strings together:

'first line\nsecond line\nthird lineblue\ncheese'
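As a quick check of this one-dimensional view, the two example files can be treated as plain Python strings. The variable names below are only for illustration.

```python
# The two example files, written as the single strings they really are.
first = 'first line\nsecond line\nthird line'
second = 'blue\ncheese'

# Three lines are separated by exactly two newline characters.
newline_count = first.count('\n')

# "Vertical" concatenation is plain string addition. Because the first
# string has no trailing newline, 'third line' and 'blue' run together.
stacked = first + second

print(newline_count)
print(repr(stacked))
```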

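What you want is something more complex: merging the files line by line (and probably adding some spacing):

'first line blue\nsecond line cheese\nthird line'

This cannot be done directly at the level of the two big strings, so you need to: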

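  • Split each file into a list of lines (e.g. ['first line', 'second line', 'third line'] and ['blue', 'cheese'])
  • Merge each line of the first file with the corresponding line of the second file (e.g. 'first line' + ' ' + 'blue')
  • Handle the extra lines, since one file may be longer than the other (e.g. 'third line' + '')
  • Join the merged lines back together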
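The four steps above can be sketched on the two in-memory strings from this example. This is only an illustration under assumptions: the helper name merge_side_by_side and its sep parameter are made up here and are not part of the original answer.

```python
from itertools import zip_longest


def merge_side_by_side(left_text, right_text, sep=' '):
    """Merge two multi-line strings line by line (hypothetical helper)."""
    # Step 1: split each string into a list of lines.
    left_lines = left_text.splitlines()
    right_lines = right_text.splitlines()
    # Steps 2 and 3: pair up corresponding lines; fillvalue='' stands in
    # for the missing lines when one side is longer than the other.
    merged = [left + sep + right
              for left, right in zip_longest(left_lines, right_lines,
                                             fillvalue='')]
    # Step 4: join the merged lines back into a single string.
    return '\n'.join(merged)


result = merge_side_by_side('first line\nsecond line\nthird line',
                            'blue\ncheese')
print(repr(result))
```

Note that in this sketch the unmatched third line still gets the separator appended, so it ends with a trailing space.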

Here is how to do that, step by step:

To read a file as lines, you could do f.read().splitlines(), but it is better to use f.readlines() or simply to iterate over the file object (for line in f: ...).
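The three ways of reading lines differ in one detail that matters for merging: splitlines() drops the newline characters, while readlines() and plain iteration keep them. A small sketch, using io.StringIO as a stand-in for an opened text file (an assumption made only to keep the demo self-contained):

```python
import io

text = 'first line\nsecond line\nthird line'

# io.StringIO behaves like a text file opened for reading.
lines_split = io.StringIO(text).read().splitlines()
lines_read = io.StringIO(text).readlines()
lines_iter = [line for line in io.StringIO(text)]

print(lines_split)               # newline characters removed
print(lines_read)                # newline kept on every terminated line
print(lines_iter == lines_read)  # iteration yields the same items
```

This is why the code further down strips '\n' from each line before merging.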

To match up the corresponding lines of the two files, you can use zip_longest from itertools:

for left_line, right_line in zip_longest(left_lines, right_lines):
    ...
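The loop above does not pass fillvalue, so it may help to see what each variant yields when one list is shorter. The lists below reuse the example lines; the variable names are only for illustration.

```python
from itertools import zip_longest

left_lines = ['first line', 'second line', 'third line']
right_lines = ['blue', 'cheese']

# Built-in zip stops at the shorter input, so 'third line' is lost.
pairs_zip = list(zip(left_lines, right_lines))

# zip_longest keeps going and fills the gap with None by default.
pairs_default = list(zip_longest(left_lines, right_lines))

# With fillvalue='' the missing side is an empty string, which can be
# concatenated or formatted without any special handling.
pairs_filled = list(zip_longest(left_lines, right_lines, fillvalue=''))

print(pairs_zip)
print(pairs_default[-1])
print(pairs_filled[-1])
```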

To join with padding: '{} {}'.format(left_line, right_line)

Putting it all together, in verbose form:
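A fixed separator leaves the right-hand column ragged when the left lines differ in length. As a possible refinement (not part of the original answer), the left column can be padded to a common width with the format specification mini-language. The two-space gap and the variable names are assumptions.

```python
left_lines = ['first line', 'second line', 'third line']
right_lines = ['blue', 'cheese', '']

# Pad every left line to the width of the longest one so the right
# column starts at the same position on every row.
width = max(len(line) for line in left_lines)

# '{:<{w}}' left-aligns the first argument in a field of w characters.
rows = ['{:<{w}}  {}'.format(left, right, w=width)
        for left, right in zip(left_lines, right_lines)]

for row in rows:
    print(row)
```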

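from itertools import zip_longest

# left_filename, right_filename and output_filename are assumed to be
# defined elsewhere as paths to the two inputs and the output.

# Read the left file into a list of lines without their newlines.
left_lines = []
with open(left_filename, 'rt') as left_file:
    for line in left_file:
        line_without_newline = line.strip('\n')
        left_lines.append(line_without_newline)

# Do the same for the right file.
right_lines = []
with open(right_filename, 'rt') as right_file:
    for line in right_file:
        line_without_newline = line.strip('\n')
        right_lines.append(line_without_newline)

# Pair up the lines; fillvalue='' covers the longer file's extra lines.
merged_lines = []
for left_line, right_line in zip_longest(left_lines, right_lines, fillvalue=''):
    merged_lines.append('{}    {}'.format(left_line, right_line))

# Write the merged lines, restoring one newline per line.
with open(output_filename, 'wt') as output_file:
    for merged_line in merged_lines:
        output_file.write(merged_line + '\n')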

Now, you can skip most of the intermediate steps to make it simpler :)

from itertools import zip_longest

with open(left_filename, 'rt') as left_file,\
     open(right_filename, 'rt') as right_file,\
     open(output_filename, 'wt') as output_file:
    for left_line, right_line in zip_longest(left_file, right_file, fillvalue=''):
        output_file.write('{}    {}\n'.format(left_line.strip('\n'),
                                              right_line.strip('\n')))
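For reuse, the same logic could be wrapped in a function and tried on throwaway files. This is a sketch only: merge_files, its sep parameter, and the temporary file names are hypothetical and do not come from the original answer.

```python
import os
import tempfile
from itertools import zip_longest


def merge_files(left_path, right_path, out_path, sep='    '):
    """Write the lines of two text files side by side (hypothetical helper)."""
    with open(left_path, 'rt') as left_file, \
         open(right_path, 'rt') as right_file, \
         open(out_path, 'wt') as out_file:
        # Iterating the file objects directly avoids building line lists.
        for left, right in zip_longest(left_file, right_file, fillvalue=''):
            out_file.write(left.rstrip('\n') + sep + right.rstrip('\n') + '\n')


# Demo on temporary files so nothing real is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    left_path = os.path.join(tmp, 'left.txt')
    right_path = os.path.join(tmp, 'right.txt')
    out_path = os.path.join(tmp, 'out.txt')
    with open(left_path, 'wt') as f:
        f.write('first line\nsecond line\nthird line\n')
    with open(right_path, 'wt') as f:
        f.write('blue\ncheese\n')
    merge_files(left_path, right_path, out_path)
    with open(out_path, 'rt') as f:
        merged = f.read()

print(merged)
```

For the JASCO files in the question, the call would be merge_files('file1.txt', 'file2.txt', 'out.txt'), assuming those files sit in the current directory.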