Create a new column in R based on a condition in another column

Time: 2018-08-24 18:22:48

Tags: r datatable multiple-columns

I want to create a new column "test" based on the calendar dates. If duration = n, I set test to the value of Temp corresponding to n - 1 days before that date.

  Ozone Solar.R Wind Temp Month Day duration Year      Dates  MaxWind MaxDates
 41     190  7.4   67     5   1        1 1900 1900-05-01     7.4  1900-05-01  
 36     118  8.0   72     5   2       NA 1900 1900-05-02      NA          NA
 12     149 12.6   74     5   3        0 1900 1900-05-03      NA          NA
 18     313 11.5   62     5   4        2 1900 1900-05-04     14.3 1900-05-05
 NA      NA 14.3   56     5   5        0 1900 1900-05-05      NA          NA
 28      NA 14.9   66     5   6        0 1900 1900-05-06      NA          NA 
 23     299  8.6   65     5   7        3 1900 1900-05-07     20.1 1900-05-09
 19      99 13.8   59     5   8        0 1900 1900-05-08      NA          NA
  8      19 20.1   61     5   9       NA 1900 1900-05-09      NA          NA   

Sample data:

structure(list(ozone = c(41, 36, 12, 18, NA, 28, 23, 19, 8), 
    Solar.R = c(190, 118, 149, 313, NA, NA, 299, 99, 19), Wind = c(7.4, 
    8, 12.6, 11.5, 14.3, 14.9, 8.6, 13.8, 20.1), Temp = c(67, 
    72, 74, 62, 56, 66, 65, 59, 61), Month = c(5, 5, 5, 5, 5, 
    5, 5, 5, 5), Day = 1:9, duration = c(1, NA, 0, 2, 0, 0, 3, 
    0, NA), Year = c(1900, 1900, 1900, 1900, 1900, 1900, 1900, 
    1900, 1900), Dates = structure(1:9, .Label = c("1900-05-01", 
    "1900-05-02", "1900-05-03", "1900-05-04", "1900-05-05", "1900-05-06", 
    "1900-05-07", "1900-05-08", "1900-05-09"), class = "factor"), 
    MaxWind = c(7.4, NA, NA, 14.3, NA, NA, 20.1, NA, NA), MaxDates = c(1894, 
    NA, NA, 1890, NA, NA, 1886, NA, NA)), class = "data.frame", row.names = c(NA, 
-9L))
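
Note that in the dput() output above, Dates is stored as a factor, so subtracting days from it will not work until it is converted. A minimal sketch of that conversion, using a hypothetical three-element vector `dates_factor` rather than the full data:

```r
# Dates as they appear in the dput() output: a factor, not a Date.
dates_factor <- factor(c("1900-05-01", "1900-05-02", "1900-05-03"))

# Go through character so the labels, not the integer codes, are parsed.
dates <- as.Date(as.character(dates_factor))

# With Date objects, subtracting whole days works as expected.
dates - 1
```

The same conversion would apply to the Dates column of the full data frame before any date arithmetic.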

1 answer:

Answer 0 (score: 1)

If I understand the question correctly, the following code does what you need.

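colTest <- function(DF, n, col = "test"){
    # Create the result column on first use.
    if(is.null(DF[[col]])) DF[[col]] <- NA
    # Rows whose duration equals n (which() drops NA comparisons).
    i <- which(DF[["duration"]] == n)
    # Rows whose date lies n - 1 days before those rows; Dates must be of class Date.
    j <- which(DF[["Dates"]] %in% (DF[["Dates"]][i] - (n - 1)))
    # Copy Temp from the duration-n rows to the earlier rows.
    DF[[col]][j] <- DF[["Temp"]][i]
    DF
}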

colTest(df1, 0)
colTest(df1, 1)
colTest(df1, 2)
colTest(df1, 3)

Note that it also works for n = 0.

Data.

df1 <- read.table(text = "
  Ozone Solar.R Wind Temp Month Day duration Year      Dates  MaxWind MaxDates
 41     190  7.4   67     5   1        1 1900 1900-05-01     7.4  1900-05-01  
 36     118  8.0   72     5   2       NA 1900 1900-05-02      NA          NA
 12     149 12.6   74     5   3        0 1900 1900-05-03      NA          NA
 18     313 11.5   62     5   4        2 1900 1900-05-04     14.3 1900-05-05
 NA      NA 14.3   56     5   5        0 1900 1900-05-05      NA          NA
 28      NA 14.9   66     5   6        0 1900 1900-05-06      NA          NA 
 23     299  8.6   65     5   7        3 1900 1900-05-07     20.1 1900-05-09
 19      99 13.8   59     5   8        0 1900 1900-05-08      NA          NA
  8      19 20.1   61     5   9       NA 1900 1900-05-09      NA          NA
", header = TRUE)

df1$Dates <- as.Date(df1$Dates)
df1$MaxDates <- as.Date(df1$MaxDates)
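
Each colTest() call above returns its own copy of the data frame, so the results for different n are not combined. The sketch below shows one way they could be chained with Reduce(). It is an illustration under assumptions, not the answer's own code: `colTestMatch` is a hypothetical variant that pairs rows with match() instead of %in%, so that each source row stays paired with its own target row even if the dates are unsorted or a target date is missing, and the data is trimmed to the three columns the function uses.

```r
# Trimmed copy of the sample data: only the columns the function needs.
df1 <- data.frame(
    Temp     = c(67, 72, 74, 62, 56, 66, 65, 59, 61),
    duration = c(1, NA, 0, 2, 0, 0, 3, 0, NA),
    Dates    = as.Date("1900-05-01") + 0:8
)

# Hypothetical variant of colTest() that pairs rows explicitly with match().
colTestMatch <- function(DF, n, col = "test") {
    if (is.null(DF[[col]])) DF[[col]] <- NA_real_
    # Source rows: duration equals n.
    i <- which(DF[["duration"]] == n)
    # Target row for each source row: the date n - 1 days earlier (NA if absent).
    j <- match(DF[["Dates"]][i] - (n - 1), DF[["Dates"]])
    ok <- !is.na(j)
    DF[[col]][j[ok]] <- DF[["Temp"]][i[ok]]
    DF
}

# Apply n = 1, 2, 3 in turn, feeding each result into the next call.
res <- Reduce(function(d, n) colTestMatch(d, n), 1:3, df1)
res$test
# 67 NA 62 NA 65 NA NA NA NA
```

With this data, n = 1 fills row 1 from itself, n = 2 fills 1900-05-03 from 1900-05-04, and n = 3 fills 1900-05-05 from 1900-05-07.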