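I have 1000 subdirectories (error1 to error1000), each containing three different CSV files (rand.csv, run_error.csv, swe_error.csv). Each CSV has a header row. I need to merge the CSV files that share the same file name, so that I end up with, for example, rand_merge.csv containing the header row and 1000 rows of data.
I followed "Merge multiple csv files with same name in 10 different subdirectory", which gave me
KeyError: 'filename'
I don't know how to resolve it, so any help would be appreciated. Thanks.
Update: this is the exact code, taken from the post linked above: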
import pandas as pd
import glob

CONCAT_DIR = "./error/files_concat/"

# Use glob module to return all csv files under root directory. Create DF from this.
files = pd.DataFrame([file for file in glob.glob("error/*/*")], columns=["fullpath"])

# Split the full path into directory and filename
files_split = files['fullpath'].str.rsplit("\\", 1, expand=True).rename(columns={0: 'path', 1:'filename'})

# Join these into one DataFrame
files = files.join(files_split)

# Iterate over unique filenames; read CSVs, concat DFs, save file
for f in files['filename'].unique():
    paths = files[files['filename'] == f]['fullpath']  # Get list of fullpaths from unique filenames
    dfs = [pd.read_csv(path, header=None) for path in paths]  # Get list of dataframes from CSV file paths
    concat_df = pd.concat(dfs)  # Concat dataframes into one
    concat_df.to_csv(CONCAT_DIR + f)  # Save dataframe
Answer 0 (score: 0)
I found my mistake. The separator passed to rsplit needs to be "/", not "\":
files_split = files['fullpath'].str.rsplit("/", 1, expand=True).rename(columns={0: 'path', 1:'filename'})
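For context, the KeyError follows from the separator mismatch: when none of the paths contains "\", rsplit returns a single column, so the rename never creates a 'filename' column. A variant that does not depend on the separator is sketched below. It assumes pathlib is acceptable and that the first row of every CSV is its header; the function name merge_same_named_csvs, the "*/*.csv" pattern and the _merge suffix are illustrative choices, not part of the linked post. Unlike the code in the question, it reads each file with its header and writes without the pandas index, so rand_merge.csv should hold one header row followed by the data rows of all input files.

```python
from pathlib import Path

import pandas as pd


def merge_same_named_csvs(root, out_dir):
    """Concatenate CSV files that share a file name across the subdirectories of root.

    Hypothetical helper for illustration; returns {file name: path of merged file}.
    """
    root, out_dir = Path(root), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group the paths by file name. Path.name is derived by pathlib itself,
    # so no path separator has to be hard-coded.
    groups = {}
    for path in sorted(root.glob("*/*.csv")):  # sorted lexicographically
        if path.parent.resolve() == out_dir.resolve():
            continue  # skip merged files left over from an earlier run
        groups.setdefault(path.name, []).append(path)

    written = {}
    for name, paths in groups.items():
        # Assumption: the first row of every file is the header row.
        frames = [pd.read_csv(p) for p in paths]
        merged = pd.concat(frames, ignore_index=True)
        target = out_dir / f"{Path(name).stem}_merge.csv"
        merged.to_csv(target, index=False)  # do not write the pandas index column
        written[name] = target
    return written
```

With the layout from the question, a call such as merge_same_named_csvs("error", "error/files_concat") would be the equivalent of the loop above. Note that the files are concatenated in lexicographic directory order (error1, error10, error100, ...), not numeric order.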