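I have a large number of CSV files, all in the same format. I need to loop over all of them, select the "Median" column (column 4) from each, and combine those columns into a single new file.
The files look like this: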
Wind_Speed Average Median Power_Curve Difference
1 0.0 NaN NA 0 NaN
2 0.5 NaN NA 0 NaN
3 1.0 NaN NA 0 NaN
4 1.5 NaN NA 0 NaN
5 2.0 NaN NA 0 NaN
6 2.5 14.12 14.12 24 -9.9
7 3.0 31.02 31.51 48 -17.0
8 3.5 55.06 57.12 96 -40.9
9 4.0 106.70 109.89 192 -85.3
10 4.5 178.13 180.76 288 -109.9
11 5.0 277.68 278.57 408 -130.3
12 5.5 401.91 400.41 540 -138.1
13 6.0 568.38 569.73 696 -127.6
14 6.5 765.16 762.98 912 -146.8
15 7.0 999.09 1002.82 1104 -104.9
16 7.5 1222.77 1216.91 1332 -109.2
17 8.0 1460.55 1463.50 1524 -63.4
18 8.5 1601.32 1597.00 1656 -54.7
19 9.0 1658.94 1664.40 1680 -21.1
20 9.5 1662.15 1667.81 1692 -29.9
21 10.0 1661.49 1665.47 1692 -30.5
22 10.5 1659.75 1663.02 1692 -32.2
23 11.0 1660.59 1661.13 1692 -31.4
24 11.5 1660.18 1659.44 1692 -31.8
25 12.0 1662.33 1666.21 1692 -29.7
26 12.5 1661.55 1661.10 1692 -30.5
27 13.0 1667.06 1677.50 1692 -24.9
28 13.5 1660.06 1661.63 1692 -31.9
29 14.0 1671.95 1686.82 1692 -20.0
30 14.5 1675.67 1687.73 1692 -16.3
31 15.0 1672.57 1685.97 1692 -19.4
32 15.5 1666.96 1673.73 1692 -25.0
33 16.0 1670.11 1681.58 1692 -21.9
34 16.5 1669.24 1686.14 1692 -22.8
35 17.0 1669.85 1677.95 1692 -22.1
36 17.5 1656.20 1644.46 1692 -35.8
37 18.0 1687.57 1687.57 1692 -4.4
38 18.5 1691.64 1691.69 1692 -0.4
39 19.0 1681.02 1686.78 1692 -11.0
40 19.5 1689.79 1689.79 1692 -2.2
41 20.0 NaN NA 1692 NaN
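Ideally, each column in the new file should be named after the file it came from.
I know it should work something like the code below, but I don't know how to write each column into the next column of a new table and then carry on with the next ii.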
files2 <- list.files(path="~/test2",pattern="*.csv", full.names=TRUE, recursive=FALSE)
for(ii in files2){
titlename<- tools::file_path_sans_ext(basename(files2))
mydata2 <-read.csv(ii, header = T, stringsAsFactors=FALSE)
mydata2<- mydata2[,4]
???
}
Answer 0: (score: 0)
setwd("~/test2")  # set the working directory to where the files are (path taken from the question)
csv_files <- list.files(pattern = "\\.csv$")  # list the CSV files in that folder
temp <- list()  # empty list that collects one column per file
for (i in csv_files) {
  temp[[i]] <- read.csv(i)[[4]]  # 4 is the index of the column to select; change it as needed
}
names(temp) <- stringr::str_remove(names(temp), "\\.csv$")  # drop ".csv" from the column names in the combined file
write.csv(data.frame(temp, check.names = FALSE), "combined.csv", row.names = FALSE)  # write the combined CSV once, after the loop
This seems to work for me.
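To try the loop approach without touching real data, here is a minimal sketch that writes two toy CSV files to a temporary folder and combines their fourth column. The file names (site_a.csv, site_b.csv), the values, and the helper make_toy are made up for illustration; only the column layout follows the question.

```r
# Hypothetical toy data: two files with the question's column layout.
demo_dir <- file.path(tempdir(), "median_demo")
dir.create(demo_dir, showWarnings = FALSE)

make_toy <- function(median_values) {
  data.frame(
    Wind_Speed = c(2.5, 3.0, 3.5),
    Average = c(14.12, 31.02, 55.06),
    Median = median_values,
    Power_Curve = c(24, 48, 96),
    Difference = c(-9.9, -17.0, -40.9)
  )
}

# write.csv adds a row-name column by default, so Median ends up as column 4
write.csv(make_toy(c(14.12, 31.51, 57.12)), file.path(demo_dir, "site_a.csv"))
write.csv(make_toy(c(10, 20, 30)), file.path(demo_dir, "site_b.csv"))

csv_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
combined <- list()
for (f in csv_files) {
  # name each column after its source file, without folder or extension
  column_name <- tools::file_path_sans_ext(basename(f))
  combined[[column_name]] <- read.csv(f)[[4]]
}

# check.names = FALSE keeps the file names unchanged as column names
combined <- data.frame(combined, check.names = FALSE)
print(combined)
```

This relies on every file having the same number of rows, which the question says is the case; files of different lengths would need padding or a merge on Wind_Speed instead.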
Answer 1: (score: 0)
An alternative using base R and lapply:
files <- list.files(path = "~/path", pattern = "\\.csv$")
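A custom function that reads a CSV, extracts the file name, and assigns it as the column name. (The folder is pasted onto the file name inside read.csv because list.files returns bare file names here; path mismatches are a common source of errors in loops like this.)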
read_files_assign_filename <- function(filename) {
  # read the file and keep only column 4 as a one-column data frame
  item <- read.csv(file.path("~/path", filename), header = TRUE)[4]
  colnames(item) <- tools::file_path_sans_ext(filename)  # remove the ".csv" extension
  item  # return the one-column data frame
}
Apply it to every file with lapply and bind the results together into one data frame.
final_result <- do.call(cbind, lapply(files, read_files_assign_filename))
Hope this helps/works!
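Since the question also asks for the result to be written to a new file, here is a rough sketch of that last step. The data frame below is only a stand-in for final_result, and the column names (turbine-01, turbine-02) and the output file name are hypothetical.

```r
# Stand-in for final_result; the column names are hypothetical file names
final_result <- data.frame(
  "turbine-01" = c(14.12, 31.51),
  "turbine-02" = c(NA, 30.00),
  check.names = FALSE
)

out_file <- file.path(tempdir(), "combined_medians.csv")
write.csv(final_result, out_file, row.names = FALSE)  # no extra row-name column

# check.names = FALSE stops read.csv from rewriting "turbine-01" as "turbine.01"
read_back <- read.csv(out_file, check.names = FALSE)
print(names(read_back))
```

Passing check.names = FALSE matters mainly when file names contain characters such as hyphens or spaces, or start with a digit; otherwise R silently converts them into syntactically valid names.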