Creating several new, similar variables from existing variables in a data.frame

Asked: 2013-11-08 19:40:24

Tags: r variables dataframe

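Once again I find myself doing a repetitive task in R, and I suspect there is a smarter, or at least shorter, way to handle it. In a data.frame I am creating a new variable JK.M*Y**** for every month and year. Each one is computed from existing variables in the data.frame with an ifelse statement, and I am writing one such statement per month and year.

First, is there a standard approach in R for repetitive tasks like this? Second, is there a smarter way to do what I am specifically doing below?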

# Example Data with 2 months, 2 years and 3 variables
DF<- structure(list(ID = 1:4, ABC.M1Y2001 = c(10, 12.3, 45, 89), ABC.M2Y2001 = c(11.1, 
          34, 67.7, -15.6), ABC.M1Y2002 = c(-11.1, 9, 34, 56.5), ABC.M2Y2002 = c(12L,
          13L, 11L, 21L), DEF.M1Y2001 = c(14L, 14L, 14L, 16L), DEF.M2Y2001 = c(15L,
          15L, 15L, 12L), DEF.M1Y2002 = c(5, 12, 23.5, 34), DEF.M2Y2002 = c(6L,
          34L, 61L, 56L), GHI.M1Y2001 = c(18.3, 2.8, 9.5, 28.2), 
          GHI.M2Y2001 = c(-0.90, 21.1, 57, -36.7), GHI.M2Y2002 = c(0.52, 
          -12.2, -32.9, 21.2), GHI.M1Y2002 = c(-11, -1.7, -5.7, -17)), 
          .Names = c("ID", "ABC.M1Y2001", "ABC.M2Y2001","ABC.M1Y2002", 
          "ABC.M2Y2002", "DEF.M1Y2001", "DEF.M2Y2001", "DEF.M1Y2002", 
          "DEF.M2Y2002", "GHI.M1Y2001","GHI.M2Y2001","GHI.M1Y2002","GHI.M2Y2002"), 
          class = "data.frame", row.names = c(NA, -4L))

# 2001 create new variable "JK" for each month per year
DF$JK.M1Y2001 <- ifelse(((4 * DF$ABC.M1Y2001)+(2*DF$DEF.M1Y2001))/5 < 0,
                         DF$GHI.M1Y2001 / (.6* exp(((2*DF$DEF.M1Y2001)/(DF$DEF.M1Y2001+7)))),
                         DF$GHI.M1Y2001 / (.6* exp(((7*DF$DEF.M1Y2001)/(DF$DEF.M1Y2001+3)))))

DF$JK.M2Y2001 <- ifelse(((4 * DF$ABC.M2Y2001)+(2*DF$DEF.M2Y2001))/5 < 0,
                         DF$GHI.M2Y2001 / (.6* exp(((2*DF$DEF.M2Y2001)/(DF$DEF.M2Y2001+7)))),
                         DF$GHI.M2Y2001 / (.6* exp(((7*DF$DEF.M2Y2001)/(DF$DEF.M2Y2001+3)))))
# and so on for 2001
# ...
# 2002 create new variable "JK" for each month per year
DF$JK.M1Y2002 <- ifelse(((4 * DF$ABC.M1Y2002)+(2*DF$DEF.M1Y2002))/5 < 0,
                        DF$GHI.M1Y2002 / (.6* exp(((2*DF$DEF.M1Y2002)/(DF$DEF.M1Y2002+7)))),
                        DF$GHI.M1Y2002 / (.6* exp(((7*DF$DEF.M1Y2002)/(DF$DEF.M1Y2002+3)))))

# ...

1 Answer:

Answer 0 (score: 1)

I would do it with two loops:

for (month in c('M1', 'M2')) {
  for (year in c('Y2001', 'Y2002')) {
    # Build names such as "ABC.M1Y2001"; paste0() joins without the
    # space that paste() inserts by default
    new.var.name <- paste0('JK.', month, year)
    first.var.name <- paste0('ABC.', month, year)
    second.var.name <- paste0('DEF.', month, year)
    third.var.name <- paste0('GHI.', month, year)
    # Same formula as in the question, with every column looked up by name
    DF[[new.var.name]] <- ifelse(((4 * DF[[first.var.name]])+(2*DF[[second.var.name]]))/5 < 0,
                    DF[[third.var.name]] / (.6* exp(((2*DF[[second.var.name]])/(DF[[second.var.name]]+7)))),
                    DF[[third.var.name]] / (.6* exp(((7*DF[[second.var.name]])/(DF[[second.var.name]]+3)))))
  }
}

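The key is to build the variable names with paste (or paste0, which inserts no separator) and to treat the data frame as a list, so that a new variable can be added with [[ ]] and a name held in a string.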
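
To make those two pieces concrete, here is a minimal sketch on a made-up two-row data frame (toy and its values are invented for illustration and are not the data from the question). It shows how paste and paste0 differ, and how [[ ]] reads and creates columns from a string:

```r
# Toy data frame, made up for illustration only
toy <- data.frame(ID = 1:2, ABC.M1Y2001 = c(10, 12.3))

month <- "M1"
year <- "Y2001"

# paste() separates its arguments with a space by default; paste0() does not
spaced <- paste("ABC.", month, year)   # "ABC. M1 Y2001"
joined <- paste0("ABC.", month, year)  # "ABC.M1Y2001"

# [[ ]] accepts a character string, so a constructed name can read a column
values <- toy[[joined]]

# Assigning to a name that does not exist yet creates a new column
toy[[paste0("JK.", month, year)]] <- values * 2
```

Only the name built with paste0 matches an existing column; toy[[spaced]] would return NULL because no column has spaces in its name.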

This could be improved, but I tried to spell it out so that you can see the idea.
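
One possible improvement, as a sketch only: write the formula once in a helper function and take the month/year suffixes from the column names instead of listing them by hand. The names calc_jk, DF2 and suffixes are my own, and DF2 is a small made-up data set that follows the question's naming scheme. The sketch assumes every ABC column has matching DEF and GHI columns.

```r
# Small made-up data set with the same naming scheme (one month, two years)
DF2 <- data.frame(
  ID = 1:3,
  ABC.M1Y2001 = c(10, -20, 45),
  DEF.M1Y2001 = c(14, 14, 16),
  GHI.M1Y2001 = c(18.3, 2.8, 9.5),
  ABC.M1Y2002 = c(-11.1, 9, 34),
  DEF.M1Y2002 = c(5, 12, 23.5),
  GHI.M1Y2002 = c(-11, -1.7, -5.7)
)

# Hypothetical helper: the JK formula from the question, written once
calc_jk <- function(abc, def, ghi) {
  ifelse((4 * abc + 2 * def) / 5 < 0,
         ghi / (0.6 * exp((2 * def) / (def + 7))),
         ghi / (0.6 * exp((7 * def) / (def + 3))))
}

# Collect every month/year suffix by stripping the "ABC." prefix
suffixes <- sub("^ABC\\.", "", grep("^ABC\\.", names(DF2), value = TRUE))

# One pass over the suffixes replaces the nested month/year loops
for (s in suffixes) {
  DF2[[paste0("JK.", s)]] <- calc_jk(DF2[[paste0("ABC.", s)]],
                                     DF2[[paste0("DEF.", s)]],
                                     DF2[[paste0("GHI.", s)]])
}
```

With this layout, adding more months or years to the data needs no change to the code, because the loop picks up whatever suffixes exist.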