Problem concatenating 2 files in Python

Asked: 2019-07-24 12:39:22

Tags: python file concatenation

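I want to append two ASCII files into a single file (for example, F1_Jan_01.txt and F1_jan_01.txt, located in the directories d01 and d02 respectively). More precisely, I have two directories, each containing files named by type (F1, F2, F3), month, and day (1 to 7), and I want to append the files that have the same name in the two directories. I wrote the following Python code to do this.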

import pandas as pd

maindir1="/home/d01/"
maindir2="/home/d02/"
outdir="/home/final/"

pol=[ "F1","F2","F3" ]

month=["Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"]

for iis,ipol in enumerate(pol):
    for jjs,imonth in enumerate(month):
        for kk in range(1,7,1):
            df1 = pd.read_csv(maindir1+str(ipol)+"_"+str(imonth)+"_0"+str(kk)+".txt", sep="\t")
            df2 = pd.read_csv(maindir2+str(ipol)+"_"+str(imonth)+"_0"+str(kk)+".txt", sep="\t")
            df = pd.concat([ df1, df2 ], ignore_index=True)
            df.to_csv(outdir+str(ipol)+"_"+str(imonth)+"_0"+str(kk)+".txt",sep="\t",index=False)

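The problem is that in the final output, when the second file is appended, its first line is not written. For example, the first file (in d01) has 100000 lines and the second file (in d02) has 50000. In the final output, the first 100000 lines are written correctly, and then 49000 lines are appended, without the first line of the second file.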

Do I need to define anything else in my code?
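
A possible explanation, assuming the input files have no header row: by default `pd.read_csv` uses the first line of each file as column names, so that line never appears as a data row. The sketch below reproduces this with two small in-memory tables and shows that passing `header=None` keeps every line. The sample data and the helper name `concat_tsv` are made up for illustration and are not from the question.

```python
import io

import pandas as pd

# Hypothetical stand-ins for one file from d01 and one from d02 (no header row).
first = "1\t10\n2\t20\n"
second = "3\t30\n4\t40\n"


def concat_tsv(texts, **read_kwargs):
    """Read each tab-separated text with pandas and concatenate the frames."""
    frames = [pd.read_csv(io.StringIO(t), sep="\t", **read_kwargs) for t in texts]
    return pd.concat(frames, ignore_index=True)


# Default behaviour: the first line of every input is consumed as column names,
# so only one data row per input remains.
default_out = concat_tsv([first, second])

# header=None: every line of every input is kept as a data row.
fixed_out = concat_tsv([first, second], header=None)

# Write without a header row so the output contains only the original lines.
fixed_text = fixed_out.to_csv(sep="\t", index=False, header=False)
```

If this is the cause, adding `header=None` to both `read_csv` calls and `header=False` to `to_csv` in the loop above should be enough. If the files do have a real header row, a different fix would be needed.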

1 Answer:

Answer 0: (score: 3)

Here is equivalent code that does not use Pandas. (Dry-coded, YMMV.)

import os

maindir1 = "/home/d01/"
maindir2 = "/home/d02/"
outdir = "/home/final/"

pols = ["F1", "F2", "F3"]
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

for ipol in pols:
    for imonth in months:
        # range(1, 7) yields days 1 to 6, the same range as the loop in the question.
        for kk in range(1, 7):
            # The same file name is used in both input directories and in the output directory.
            filename = "{ipol}_{imonth}_0{kk}.txt".format(ipol=ipol, imonth=imonth, kk=kk)
            out_name = os.path.join(outdir, filename)
            in_names = [os.path.join(maindir1, filename), os.path.join(maindir2, filename)]
            with open(out_name, "w") as out_file:
                # Copy the raw text of each input in order; no line is treated as a header.
                for in_name in in_names:
                    with open(in_name, "r") as in_file:
                        out_file.write(in_file.read())
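
The loop above reads each input file into memory in one go. For large files, a chunked copy may be gentler on memory. The following is a minimal sketch of that idea, not part of the original answer: the function name `concat_files`, the `chunk_size` parameter, and the temporary demo files are all hypothetical. It also assumes the inputs are line-oriented text, and therefore adds a newline when an input does not end with one, so that the last line of one file does not run into the first line of the next.

```python
import os
import tempfile


def concat_files(in_names, out_name, chunk_size=1024 * 1024):
    """Append each input file to out_name in order, copying in fixed-size chunks."""
    with open(out_name, "wb") as out_file:
        for in_name in in_names:
            last_byte = b"\n"  # an empty input needs no separator
            with open(in_name, "rb") as in_file:
                while True:
                    chunk = in_file.read(chunk_size)
                    if not chunk:
                        break
                    out_file.write(chunk)
                    last_byte = chunk[-1:]
            # Assumption: line-oriented text, so terminate an unterminated last line.
            if last_byte != b"\n":
                out_file.write(b"\n")


# Small demonstration with temporary files (names are illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "a.txt")
    b = os.path.join(tmp, "b.txt")
    out = os.path.join(tmp, "out.txt")
    with open(a, "wb") as f:
        f.write(b"line 1\nline 2")  # deliberately no trailing newline
    with open(b, "wb") as f:
        f.write(b"line 3\n")
    concat_files([a, b], out)
    with open(out, "rb") as f:
        result = f.read()
```

In the nested loop, `concat_files(in_names, out_name)` could stand in for the `with open(out_name, "w")` block, under the same assumptions.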