How to create a new column in a data frame using a for loop and if statements

Asked: 2014-12-10 18:01:48

Tags: r if-statement for-loop

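I have a data frame with 102 rows, and I need to write a for loop with if statements that fills a new column, "Season", based on four other columns (Sp, Su, Fa, Wi). A 1 in one of those columns marks the season in which the sample was taken (see below).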

Sp  Su  Fa  Wi
1   0   0   0
0   0   0   1

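I tried writing a loop for summer only, but I get a lot of errors. I can't seem to get the hang of for loops and if statements. Any help would be appreciated.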

for(i in 1:102) {  if(myData$Su==1) myData$Season=Summer}

Error:

In if (myData$Su == 1) myData$Season = Summer :
  the condition has length > 1 and only the first element will be used

5 answers:

Answer 0 (score: 3)

Find which column holds the 1 in each row, then use that index to pull the season name from a character vector:

data <- c("Sp  Su  Fa  Wi
           1   0   0   0
           0   0   0   1")
data <- read.table(text=data,header=TRUE)

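# which(..., arr.ind = TRUE) lists matches column by column,
# so sort them by row before looking up the season names
idx <- which(data==1,arr.ind=TRUE)
data$Season <- c("Spring","Summer","Fall","Winter")[idx[order(idx[,"row"]),"col"]]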

Result:

  Sp Su Fa Wi Season
1  1  0  0  0 Spring
2  0  0  0  1 Winter
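A possible variation on the same column-lookup idea, offered as a sketch rather than as part of the answer above: `max.col()` returns, for each row, the position of the largest value, so on a 0/1 matrix it gives the column that holds the 1, already in row order. The names `seasons` and `flags` and the extra third row are assumptions made for illustration, and the code assumes each row should contain exactly one 1.

```r
# Illustrative data: same layout as the question, plus one extra row
data <- read.table(text = "Sp Su Fa Wi
1 0 0 0
0 0 0 1
0 1 0 0", header = TRUE)

seasons <- c("Spring", "Summer", "Fall", "Winter")

# Work on the four indicator columns only, as a numeric matrix
flags <- as.matrix(data[, c("Sp", "Su", "Fa", "Wi")])

# For each row, the column index of the largest value (the 1)
data$Season <- seasons[max.col(flags, ties.method = "first")]

# Rows that do not contain exactly one 1 are ambiguous: mark them NA
data$Season[rowSums(flags) != 1] <- NA

print(data)
```

The `rowSums()` check matters because `max.col()` would otherwise report the first column for a row of all zeros.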

Answer 1 (score: 1)

Because R is a vectorized language, a for loop isn't needed in this case.

dat <- data.frame(
  Sp = c(1, 0),
  Su = c(0, 0),
  Fa = c(0, 0),
  Wi = c(0, 1)
)

A naive, brute-force way is to use nested ifelse() calls:

dat$Season <- with(dat, 
                   ifelse(Sp == 1, "Spring", 
                          ifelse(Su == 1, "Summer", 
                                 ifelse(Fa == 1, "Fall", 
                                        "Winter"))))
dat

  Sp Su Fa Wi Season
1  1  0  0  0 Spring
2  0  0  0  1 Winter
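To make the contrast with the loop in the question concrete, here is a small sketch: comparing a whole column against 1 yields one logical value per row, which is why `if()` (which expects a single TRUE or FALSE) complained, whereas `ifelse()` evaluates the condition element by element. The data frame `demo` and the variables `cond` and `labels` are made up for illustration.

```r
# Illustrative two-row data frame (not the asker's data)
demo <- data.frame(Sp = c(1, 0), Su = c(0, 1))

# The comparison is vectorized: one TRUE/FALSE per row
cond <- demo$Su == 1
print(length(cond))

# ifelse() picks a value for every element of the condition
labels <- ifelse(cond, "Summer", "Not Summer")
print(labels)
```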

But the R way is to think about the structure of the data and then use indexing, for example:

dat$season <- apply(dat, 1, function(x) c("Sp", "Su", "Fa", "Wi")[x==1])

  Sp Su Fa Wi season
1  1  0  0  0     Sp
2  0  0  0  1     Wi

Answer 2 (score: 0)

myData$Season <- ifelse(myData$Su==1, "Summer", "Not Summer")

or use a more elaborate "no" branch (for example a nested ifelse: if Wi == 1, set it to Winter, and so on)

Answer 3 (score: 0)

If you really want to use a loop, you should do it like this:

# recreating an example similar to your data
myData <- read.csv(text= 
"Sp,Su,Fa,Wi
1,0,0,0
0,1,0,0
0,0,1,0
1,0,0,0
0,0,0,1")

# before the loop, add a new "Season" column to myData filled with NAs
myData$Season <- NA

# don't use 102 but nrow(myData) so
# in case myData changes you don't have to modify the code
for(i in seq_len(nrow(myData))){

  # here you are working row-by-row
  # so note the [i] indexing below

  if(myData$Sp[i] == 1){
    myData$Season[i] = "Spring"
  }else if(myData$Su[i] == 1){
    myData$Season[i] = "Summer"
  }else if(myData$Fa[i] == 1){
    myData$Season[i] = "Fall"
  }else if(myData$Wi[i] == 1){
    myData$Season[i] = "Winter"
  }
}

But in practice (as the other answers show) there are more efficient and faster ways.
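One middle ground, sketched here under the assumption that every row has exactly one 1, is to loop over the four indicator columns instead of over the rows and fill all matching rows at once with logical indexing. The lookup vector `season_names` is a hypothetical helper introduced for this example.

```r
# Same example data as in the loop above
myData <- read.csv(text =
"Sp,Su,Fa,Wi
1,0,0,0
0,1,0,0
0,0,1,0
1,0,0,0
0,0,0,1")

# Hypothetical lookup: column name -> season label
season_names <- c(Sp = "Spring", Su = "Summer", Fa = "Fall", Wi = "Winter")

# Start with an empty character column
myData$Season <- NA_character_

# Four iterations (one per column) instead of one per row
for (col in names(season_names)) {
  # logical index selects every row where this column is 1
  myData$Season[myData[[col]] == 1] <- season_names[[col]]
}

print(myData)
```

The loop runs four times no matter how many rows the data frame has, so it stays readable for someone who prefers loops while still relying on vectorized assignment.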

Answer 4 (score: 0)

You could also use the following (a variation on @Emer's approach):

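 # multiplying the 0/1 matrix by 1:ncol(dat) gives, for each row,
 # the index of the column that holds the 1
 transform(dat, Season=c('Spring', 'Summer', 'Fall',
             'Winter')[as.matrix(dat) %*% seq_len(ncol(dat))])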
 #  Sp Su Fa Wi Season
 #1  1  0  0  0 Spring
 #2  0  0  0  1 Winter

Data

 dat <- structure(list(Sp = c(1, 0), Su = c(0, 0), Fa = c(0, 0), Wi = c(0, 
 1)), .Names = c("Sp", "Su", "Fa", "Wi"), row.names = c(NA, -2L
 ), class = "data.frame")