Running a script in a loop in R

Time: 2018-08-03 14:38:28

Tags: r tidyverse

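I need to run a script for each gauging station (I change the station number in the script by hand, one station at a time), but there are more than 100 stations.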

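I thought a loop in the script might save me time. I have never written a loop and don't know whether what I want is even possible. I tried, but it did not work.

A sample of my df08 data (txt):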

RowNum,date,code,gauging_station,precp
1,01/01/2008 01:00,1586,315,0.4
2,01/01/2008 01:00,10990,16589,0.2
3,01/01/2008 01:00,17221,30523,0.6
4,01/01/2008 01:00,34592,17344,0
5,01/01/2008 01:00,38131,373,0
6,01/01/2008 01:00,44287,370,0
7,01/01/2008 01:00,53903,17314,0.4
8,01/01/2008 01:00,56005,16596,0
9,01/01/2008 01:00,56349,342,0
10,01/01/2008 01:00,57294,346,0
11,01/01/2008 01:00,64423,533,0
12,01/01/2008 01:00,75266,513,0
13,01/01/2008 01:00,96514,19187,0

Code:

station <- sample(50:150,53,replace=F)

        for(i in station) 
          {

        df08_1 <- filter(df08, V7==station [i])

        colnames(df08_1) <- c("Date","gauging_station", "code", "precp")


        df08_1 <- unique(df08_1)


        final <- df08_1 %>%
          group_by(Date=floor_date(Date, "1 hour"), gauging_station, code) %>%
          summarize(precp=sum(precp))


        write.csv(final,file="../station [i].csv", row.names = FALSE)

    }

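2 answers:

Answer 0 (score: 2)

If you are not opposed to using some tidyverse packages, I think you can simplify this:

Updated with the new sample data; this runs fine on my machine:

Code: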

library(dplyr)

dat %>%
  select(-RowNum) %>%
  distinct() %>% 
  group_by(date_hour = lubridate::floor_date(date, 'hour'), gauging_station, code) %>%
  summarize(precp = sum(precp)) %>%
  ungroup() %>%
  split(.$gauging_station) %>%
  # one CSV per station; .y is the list name (the station ID), so the file
  # name stays a single string even when a station has several rows
  purrr::iwalk(~write.csv(.x,
                          file = paste0('../', .y, '.csv'),
                          row.names = FALSE))

Data:

dat <- data.table::fread("RowNum,date,code,gauging_station,precp
                  1,01/01/2008 01:00,1586,315,0.4
                  2,01/01/2008 01:00,10990,16589,0.2
                  3,01/01/2008 01:00,17221,30523,0.6
                  4,01/01/2008 01:00,34592,17344,0
                  5,01/01/2008 01:00,38131,373,0
                  6,01/01/2008 01:00,44287,370,0
                  7,01/01/2008 01:00,53903,17314,0.4
                  8,01/01/2008 01:00,56005,16596,0
                  9,01/01/2008 01:00,56349,342,0
                  10,01/01/2008 01:00,57294,346,0
                  11,01/01/2008 01:00,64423,533,0
                  12,01/01/2008 01:00,75266,513,0
                  13,01/01/2008 01:00,96514,19187,0") %>%
  mutate(date = as.POSIXct(date, format = '%m/%d/%Y %H:%M'))
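The split-then-write step can also be sketched in base R alone, which may make it easier to see what happens per station. This is only an illustration: the `toy` data frame and the use of `tempdir()` as the output folder are assumptions, not part of the original data or paths.

```r
# Hypothetical toy data: two stations with several rows each
toy <- data.frame(
  gauging_station = c(315, 315, 373, 373, 373),
  precp           = c(0.4, 0.2, 0, 0.6, 0.1)
)

out_dir <- tempdir()  # assumption: write to a temp folder instead of '../'

# split() returns a named list; the names are the station IDs as strings
by_station <- split(toy, toy$gauging_station)

for (id in names(by_station)) {
  # build exactly one file name per station from the list name
  path <- file.path(out_dir, paste0(id, ".csv"))
  write.csv(by_station[[id]], file = path, row.names = FALSE)
}

# paths of the files that were just written
written <- file.path(out_dir, paste0(names(by_station), ".csv"))
```

Taking the file name from the list name rather than from a column keeps it a single string, no matter how many rows a station has.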

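Answer 1 (score: 0)

I can't comment for lack of reputation, but if the code works once you replace station [i] with an actual station number, then it sounds like each station is a subset of the df08 object (a data frame) and has to be extracted from it.

If I understand you correctly, I would do it like this: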

library(dplyr)      # filter(), group_by(), summarize(), %>%
library(lubridate)  # floor_date(), assuming that is the function used in the question

stations <- c(1:100) #put your station IDs into a vector

for(i in stations) { #run the script for each entry in the vector

  #assuming that 'V7' is the name of the (unnamed) seventh column of df08, it could
  #work like this:
  df08_1 <- filter(df08, V7 == i)    #if your station names are strings such as
  #'station 1', compare against paste("station", i) instead

  colnames(df08_1) <- c("Date","gauging_station", "code", "precp")

  df08_1 <- unique(df08_1)

  final <- df08_1 %>%
    group_by(Date=floor_date(Date, "1 hour"), gauging_station, code) %>% 
    summarize(precp=sum(precp))

  write.csv(final,file=paste("../station", i, ".csv", sep=""), row.names = FALSE) 
  #automatically generate names. You can modify the string to whatever you want ofc.

}
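Two details of the loop in the question are easy to miss, and the small sketch below tries to make them visible. The station IDs are made up for illustration. First, `for (i in station)` already gives you each value, so `station[i]` looks up position `i` and usually lands past the end of the vector. Second, `[i]` inside a quoted string is literal text and is never replaced.

```r
station <- c(315, 373, 533)  # hypothetical station IDs

# for (i in station) hands you the values themselves, not their positions
seen <- c()
for (i in station) {
  seen <- c(seen, i)
}

# Using a value as an index looks past the end of the vector and yields NA
out_of_range <- station[315]

# A quoted "[i]" is literal text; paste0() is what inserts the value
literal_name <- "../station [i].csv"
built_name   <- paste0("../station", station[1], ".csv")
```

With these two points in mind, filtering on `i` directly and building the file name with `paste()` or `paste0()` is the more likely fix.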

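If neither this nor the other examples work, could you give us some dummy data to play with, just to see what the df08 data frame looks like? Also, what does the floor_date() function do, in case it is not lubridate::floor_date()?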
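For reference, `lubridate::floor_date(x, "hour")` rounds each timestamp down to the start of its hour, so grouping on the result yields hourly totals. The sketch below shows the same idea with base R's `trunc()` as a stand-in, an assumption made only to avoid package dependencies; the timestamps and values are invented.

```r
# Hypothetical timestamps that fall into two different hours
stamps <- as.POSIXct(
  c("2008-01-01 01:05", "2008-01-01 01:40", "2008-01-01 02:10"),
  format = "%Y-%m-%d %H:%M", tz = "UTC"
)
precp <- c(0.4, 0.2, 0.6)

# trunc(..., "hours") drops minutes and seconds; it is used here as a
# stand-in for lubridate::floor_date(stamps, "hour")
hour_key <- format(trunc(stamps, "hours"), "%Y-%m-%d %H:%M")

# Sum precipitation within each hour
hourly <- tapply(precp, hour_key, sum)
```

Here the first two readings share the 01:00 hour and are summed together, while the third stays alone in the 02:00 hour.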