Merging files from different folders

Time: 2016-02-22 02:57:50

Tags: python pandas

I have two folders. The files in folder 1 look like this:

Year   Pressure
1995   1.2
1996   2.7
1997   3.1 
1998   5.6

The files in folder 2 look like this:

Year   NDVI
1995   1.0
1995   2.8
1995   0.2
1996   1.2
1996   0.9
1997   6.7
1997   5.7
1998   3.4
1998   1.2

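There are 53 files in each folder. I want to merge them based on file order (they all have corresponding names, and both folders list them in the same order).

So far I am using this: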

import pandas as pd
import os

#path to folder 1
pth1=(r'D:\Sheyenne\Grazing_Regressions\NDVI\grazing')
#path to folder 2
pth2=(r'D:\Sheyenne\Grazing_Regressions\NDVI\NDVI')
#output pathway
outfile=(r'D:\Sheyenne\Grazing_Regressions\NDVI\final_merge')

for f in os.listdir(pth1):
    df = pd.read_csv(os.path.join(pth1, f))
    for f2 in os.listdir(pth2):
        df2=pd.read_csv(os.path.join(pth2, f2))
        outpath=os.path.join(outfile, f2)
        finalmerge=pd.merge(df,df2, left_on='Year', right_on='Year', how='right')
        finalmerge.to_csv(outpath)

But it only merges the last file in pth1 with all of the files in pth2.

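2 Answers:

Answer 0: (score: 2)

The nested loop rewrites every output file on each pass of the outer loop, so only the merge with the last file in pth1 survives. You can use a single loop over the paired file names to keep it simple: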

for f, f2 in zip(os.listdir(pth1),os.listdir(pth2)):
    df = pd.read_csv(os.path.join(pth1, f))
    df2 = pd.read_csv(os.path.join(pth2, f2))

    outpath=os.path.join(outfile, f2)

    finalmerge=pd.merge(df, df2, left_on='Year', right_on='Year', how='right')
    finalmerge.to_csv(outpath)
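
One caveat with pairing by position: `os.listdir` returns entries in arbitrary order, so the pairs are only reliable if both listings are sorted first. The following is a minimal sketch, assuming the file names in the two folders sort into the same relative order; `merge_folders` is a hypothetical helper name, not part of pandas.

```python
import os
import pandas as pd


def merge_folders(pth1, pth2, outfile):
    """Merge the i-th file of pth1 with the i-th file of pth2 on 'Year'."""
    # os.listdir gives no ordering guarantee, so sort both listings.
    files1 = sorted(os.listdir(pth1))
    files2 = sorted(os.listdir(pth2))
    if len(files1) != len(files2):
        raise ValueError("folders contain a different number of files")

    os.makedirs(outfile, exist_ok=True)
    for f, f2 in zip(files1, files2):
        df = pd.read_csv(os.path.join(pth1, f))
        df2 = pd.read_csv(os.path.join(pth2, f2))
        # how='right' keeps every row of df2 and attaches the matching df row.
        merged = pd.merge(df, df2, on='Year', how='right')
        # index=False leaves the row index out of the output file.
        merged.to_csv(os.path.join(outfile, f2), index=False)
```

Calling `merge_folders(pth1, pth2, outfile)` with the three paths from the question should write one merged file per pair, named after the file from pth2.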

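Answer 1: (score: 0)

I'm not familiar with pandas, but if the files have the same structure and row order, you can do this by writing a new file with the built-in csv package. Something like this: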
import csv

path_one = 'your/path/here'
path_two = 'your/other_path/here'

headers = ['Year', 'NDVI', 'Pressure']
things_to_add = []

# Collect the second column (Pressure) of the first file, skipping the header
with open(path_one, 'r') as one:
    for i, line in enumerate(one):
        if i > 0:
            things_to_add.append(line.strip().split(',')[1])

# Pair rows by position: assumes both files have the same number of rows
with open(path_two, 'r') as two, open('path/to/end/file.csv', 'w', newline='') as ending_file:
    writer = csv.writer(ending_file)
    writer.writerow(headers)
    for i, line in enumerate(two):
        if i > 0:
            year, ndvi = line.strip().split(',')[:2]
            writer.writerow([year, ndvi, things_to_add[i - 1]])
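
Pairing rows by position only works when both files have the same number of rows. In the question's sample data the NDVI file repeats each year, so a lookup keyed on Year may be a closer fit. The following is a minimal sketch using only the csv module, assuming comma-separated files with the headers `Year,Pressure` and `Year,NDVI`; `merge_by_year` is a hypothetical helper name.

```python
import csv


def merge_by_year(pressure_path, ndvi_path, out_path):
    """Append the matching Pressure value to every NDVI row, keyed on Year."""
    # Build a Year -> Pressure lookup from the first file.
    with open(pressure_path, newline='') as fh:
        pressure = {row['Year']: row['Pressure'] for row in csv.DictReader(fh)}

    with open(ndvi_path, newline='') as src, \
            open(out_path, 'w', newline='') as dst:
        writer = csv.writer(dst)
        writer.writerow(['Year', 'NDVI', 'Pressure'])
        for row in csv.DictReader(src):
            # Years missing from the pressure file get an empty cell.
            writer.writerow(
                [row['Year'], row['NDVI'], pressure.get(row['Year'], '')]
            )
```

Every NDVI row is kept, which mirrors the `how='right'` merge used in the pandas version.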