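I am in a directory that contains 36 different folders. Each folder holds a single CSV file. I want to append all of them together to make one big data frame in Python.
In R, I would do this: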
cwd = getwd() #get current directory
fil = list.files() #get list of all files/folders in the directory
Bigdf = NULL #initialize empty df
for(i in fil){ #read through all folders in current directory
    setwd(paste0(cwd,'/',i)) #navigate to i'th folder
    fil2 = list.files() #get list of files in i'th folder
    for(j in fil2){
        a = read.csv(paste0(cwd,'/',i,'/',j)) #read in all csv's
        Bigdf = rbind(Bigdf,a[,c(2,4:11)]) #append desired columns to data frame
    }
    setwd(cwd)
}
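How can I do something similar in Python?
I tried to implement "How can I read the contents of all the files in a directory with pandas?" and "How do I list all files of a directory?", but to no avail. I think I am missing something obvious and hope someone can point me in the right direction.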
Answer 0 (score: 0)
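import glob
import pandas as pd

li = []
# with recursive=True, '**' matches src/ itself and every nested subfolder
for filename in glob.iglob('src/**/*.csv', recursive=True):
    df = pd.read_csv(filename, index_col=None, header=0)
    li.append(df)
# stack all frames row-wise and renumber the index from 0
frame = pd.concat(li, axis=0, ignore_index=True)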
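The snippet above keeps every column, whereas the R loop in the question keeps only `a[,c(2,4:11)]`. A minimal sketch of one way to mirror that selection follows, assuming each CSV has at least 11 columns in the same order and sits exactly one folder below the starting directory. The names `combine_csvs` and `R_COLUMNS` are hypothetical helpers introduced for illustration, not part of pandas.

```python
from pathlib import Path

import pandas as pd

# R's c(2, 4:11) is 1-based and inclusive. The 0-based equivalent is
# position 1 plus positions 3 through 10.
R_COLUMNS = [1] + list(range(3, 11))


def combine_csvs(root=".", col_positions=None):
    """Read every CSV exactly one folder below `root` and stack the rows."""
    frames = []
    # "*/*.csv" matches CSV files one directory level down; sorted() makes
    # the row order reproducible across runs.
    for csv_path in sorted(Path(root).glob("*/*.csv")):
        df = pd.read_csv(csv_path)
        if col_positions is not None:
            # Select columns by position, like a[, c(2, 4:11)] in R.
            df = df.iloc[:, col_positions]
        frames.append(df)
    if not frames:
        raise FileNotFoundError(f"no CSV files found one level below {root}")
    # A single concat at the end avoids re-copying the growing frame on
    # every iteration, which is what rbind inside a loop does.
    return pd.concat(frames, axis=0, ignore_index=True)
```

Run from the directory that holds the 36 folders, `big_df = combine_csvs(".", R_COLUMNS)` should produce the equivalent of `Bigdf`. Note that `pd.concat` aligns on column names, so files whose headers differ will yield a union of columns padded with NaN rather than an error.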
Combining
Import multiple csv files into pandas and concatenate into one DataFrame
and