I have the R code below. The data it reads in looks like this:
track_id day hour month year rate gate_id pres_inter vmax_inter
9 10 0 7 1 9.6451E-06 2 97809 23.545
9 10 0 7 1 9.6451E-06 17 100170 13.843
10 3 6 7 1 9.6451E-06 2 96662 31.568
13 22 12 8 1 9.6451E-06 1 94449 48.466
13 22 12 8 1 9.6451E-06 17 96749 30.55
16 13 0 8 1 9.6451E-06 4 98702 19.205
16 13 0 8 1 9.6451E-06 16 98585 18.143
19 27 6 9 1 9.6451E-06 9 98838 20.053
19 27 6 9 1 9.6451E-06 17 99221 17.677
30 13 12 6 2 9.6451E-06 2 97876 27.687
30 13 12 6 2 9.6451E-06 16 99842 18.163
32 20 18 6 2 9.6451E-06 1 99307 17.527
##################################################################
# Input / Output variables
##################################################################
library(plyr)  # provides ddply() and count()

for (N in 59:96) {
  # Zero-pad the track number to four digits, e.g. 59 -> "0059"
  TrackID <- sprintf("%04d", N)
  print(TrackID)

  # For 2010_08_24 trackset
  # fname_in <- paste('input/2010_08_24/intersections_track_calibrated_jma_from1951_',TrackID,'.csv', sep="")
  # fname_out <- paste('output/2010_08_24/tracks_crossing_regional_polygon_',TrackID,'.csv', sep="")

  # For 2012_05_01 trackset
  fname_in <- paste('input/2012_05_01/intersections_track_param_', TrackID, '.csv', sep="")
  fname_out <- paste('output/2012_05_01/tracks_crossing_regional_polygon_', TrackID, '.csv', sep="")
  fname_out2 <- paste('output/2012_05_01/GateID_', TrackID, '.csv', sep="")

  #######################################################################
  # Read the gate crossing track data
  cat('reading the crosstat output file', fname_in, '\n')
  header <- read.table(fname_in, nrows=1)
  track <- read.table(fname_in, sep=',', skip=1)
  # Column order follows the header row of the sample data shown above
  colnames(track) <- c("ID", "day", "hour", "month", "year", "rate", "gate_id", "pres_inter", "vmax_inter")
  # track_id=track[,1]
  # pres_inter=track[,15]

  # For each storm ID, keep the row with the largest vmax_inter
  ByTrack <- ddply(track, "ID", function(x) x[which.max(x$vmax_inter),])
  # Count how many records fall on each gate
  ByGate <- count(track, vars="gate_id")

  # Write the output file with a single record per storm
  cat('Writing the full output file', fname_out, '\n')
  write.table(ByTrack, fname_out, col.names=TRUE, row.names=FALSE, sep=',')

  # Write the output file with the record count per gate
  cat('Writing the full output file', fname_out2, '\n')
  write.table(ByGate, fname_out2, col.names=TRUE, row.names=FALSE, sep=',')
}
The last part of my code writes a file grouped by GateID, giving the frequency with which each gate occurs. It looks like this:
gate_id freq
1 935
2 2096
3 1363
4 963
5 167
6 17
7 43
8 62
9 208
10 267
11 64
12 162
13 178
14 632
15 807
16 2003
17 838
18 293
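The problem is that I get one file like this for each of my 96 different input files. I would like to compute these aggregates for each input file, then sum the frequencies over all 96 inputs and print a SINGLE output file instead of 96 separate ones. Can anyone help?
Thanks, K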
Answer 0 (score: 1)
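You will need a function along the following lines. It picks up every .csv file in the working directory, so that directory must contain only the files you want to analyze.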
myFun <- function(out.file = "mydata") {
  files <- list.files(pattern = "\\.(csv|CSV)$")
  # Use this next line if you are going to use the file name as a variable/output etc.
  files.noext <- tools::file_path_sans_ext(basename(files))
  for (i in seq_along(files)) {
    temp <- read.csv(files[i], header = FALSE)
    # YOUR CODE HERE
    # Use the code you have already written but operate on files[i] or temp
    # Save the important stuff into one data frame that grows
    # Think carefully ahead of time what structure makes the most sense
  }
  datafile <- paste(out.file, ".csv", sep = "")
  # yourDataFrame is a placeholder for the data frame you build in the loop
  write.csv(yourDataFrame, file = datafile)
}
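
To make the skeleton above concrete for the gate counts, here is a minimal sketch in base R. It assumes every input file is comma-separated, has one header row, and stores gate_id in the seventh column, as in the question. The function name sum_gate_counts and the output file name GateID_all.csv are made up for illustration. Counting over the pooled rows gives the same totals as counting per file and then summing the frequencies.

```r
# Sketch: one combined gate_id frequency table for a set of input files.
# Assumptions (not confirmed by the question): comma-separated files,
# one header row, gate_id in column 7.
sum_gate_counts <- function(files) {
  # Collect the gate_id column of every file into one long vector
  gate_ids <- unlist(lapply(files, function(f) {
    read.table(f, sep = ",", skip = 1)[[7]]
  }))
  # Counting the pooled ids equals summing the per-file frequencies
  counts <- table(gate_ids)
  data.frame(gate_id = as.integer(names(counts)),
             freq    = as.vector(counts))
}

# Hypothetical usage with the file naming from the question:
# files <- sprintf("input/2012_05_01/intersections_track_param_%04d.csv", 59:96)
# total <- sum_gate_counts(files)
# write.table(total, "output/2012_05_01/GateID_all.csv",
#             col.names = TRUE, row.names = FALSE, sep = ",")
```

Passing the file list explicitly, rather than scanning a directory, avoids picking up unrelated .csv files. A gate that never occurs in any file will simply be absent from the result.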