Merging thousands of CSV files with the same name across different subdirectories

Time: 2019-07-01 16:37:36

Tags: python-2.7

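I have 1000 subdirectories (error1-error1000), each containing three different CSV files (rand.csv, run_error.csv, swe_error.csv). Each CSV has a header row. I need to merge the CSV files that share the same file name, so that I end up with, for example, a rand_merge.csv containing the header row and 1000 rows of data.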

I followed "Merge multiple csv files with same name in 10 different subdirectory", which gives me:

KeyError: 'filename'

I don't know how to fix it, so any help would be appreciated. Thanks.

Update: this is the exact code, taken from the post linked above:

import pandas as pd
import glob

CONCAT_DIR = "./error/files_concat/"

# Use glob module to return all csv files under root directory. Create DF from this.
files = pd.DataFrame([file for file in glob.glob("error/*/*")], columns=["fullpath"])


# Split the full path into directory and filename
files_split = files['fullpath'].str.rsplit("\\", 1, expand=True).rename(columns={0: 'path', 1:'filename'})


# Join these into one DataFrame
files = files.join(files_split)


# Iterate over unique filenames; read CSVs, concat DFs, save file
for f in files['filename'].unique():
    paths = files[files['filename'] == f]['fullpath'] # Get list of fullpaths from unique filenames
    dfs = [pd.read_csv(path, header=None) for path in paths] # Get list of dataframes from CSV file paths
    concat_df = pd.concat(dfs) # Concat dataframes into one
    concat_df.to_csv(CONCAT_DIR + f) # Save dataframe

1 answer:

Answer 0 (score: 0)

I found my mistake. The separator passed to rsplit needs to be a forward slash ("/"), not a backslash ("\\"):

files_split = files['fullpath'].str.rsplit("/", 1, expand=True).rename(columns={0: 'path', 1:'filename'})
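
Why the original line raised KeyError: when none of the paths contains a backslash, rsplit with expand=True returns a single column 0, the rename finds no column 1 to turn into 'filename', and files['filename'] then fails. Below is a minimal sketch, not the code from the linked post, of a variant that avoids hard-coding the separator by using os.path.basename and that writes the header row only once. It uses only the standard library; the function names group_by_filename and merge_csvs, the "error" directory layout, and the "_merge.csv" output naming are assumptions made for illustration. It also assumes every file has exactly one header row and no quoted fields containing line breaks.

```python
import csv
import glob
import os
from collections import defaultdict


def group_by_filename(pattern):
    """Map each base file name to the sorted list of full paths that carry it."""
    groups = defaultdict(list)
    for fullpath in glob.glob(pattern):
        # os.path.basename splits on the separator of the current OS,
        # so neither "/" nor "\\" is hard-coded.
        groups[os.path.basename(fullpath)].append(fullpath)
    return {name: sorted(paths) for name, paths in groups.items()}


def merge_csvs(paths, out_path):
    """Concatenate CSV files, keeping the header row of the first file only."""
    with open(out_path, "w") as out:
        writer = csv.writer(out, lineterminator="\n")
        for i, path in enumerate(paths):
            with open(path) as src:
                rows = list(csv.reader(src))
            # Write the header from the first file, skip it in the others.
            writer.writerows(rows if i == 0 else rows[1:])


if __name__ == "__main__":
    # Hypothetical layout: error/error1/rand.csv, error/error2/rand.csv, ...
    pattern = os.path.join("error", "*", "*.csv")
    for name, paths in group_by_filename(pattern).items():
        stem = os.path.splitext(name)[0]
        merge_csvs(paths, stem + "_merge.csv")
```

Paths are sorted as strings, so error10 comes before error2; if the row order matters, sort on the numeric part of the directory name instead.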