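I have a data frame (data) with 639 rows and 6 columns. Each cell holds a time in seconds.
So far I have computed two thresholds for each column, which gives 6 lower and 6 upper thresholds for the 6 columns: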
threshold1
[1] 16 22 31 6 11 13
threshold2
[1] 200.0 275.0 387.5 75.0 137.5 162.5
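These thresholds give the minimum and maximum number of seconds allowed for each column. For every column I want to flag the cells that fall outside that range. In column 1, for example, that means all cells below 16 seconds and all cells above 200 seconds.
I did this as follows: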
column1 <- ifelse(data$column1 < threshold1[1], "speeder",
                  ifelse(data$column1 > threshold2[1], "slower", 1))
column2 <- ifelse(data$column2 < threshold1[2], "speeder",
                  ifelse(data$column2 > threshold2[2], "slower", 1))
column3 <- ifelse(data$column3 < threshold1[3], "speeder",
                  ifelse(data$column3 > threshold2[3], "slower", 1))
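And so on for all 6 columns.
Now I would like to write this as a loop so that I do not have to type the ifelse call by hand each time, because I have several data sets with different numbers of columns.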
Answer 0: (score: 1)
First generate some sample data named "dat":
dat <- data.frame(
column1 = runif(n = 638, min=0, max=220),
column2 = runif(n = 638, min=0, max=300),
column3 = runif(n = 638, min=0, max=400),
column4 = runif(n = 638, min=0, max=100),
column5 = runif(n = 638, min=0, max=150),
column6 = runif(n = 638, min=0, max=200))
# define thresholds
threshold1 <- c(16, 22, 31, 6, 11, 13)
threshold2 <- c(200.0, 275.0, 387.5, 75.0, 137.5, 162.5)
Using a loop
# Declare a list that will contain the results
results <- list()
# Loop over the columns
for (i in seq_len(ncol(dat))) {
  results[[colnames(dat)[i]]] <- ifelse(dat[, i] < threshold1[i],
                                        yes = "speeder",
                                        no = ifelse(dat[, i] > threshold2[i],
                                                    yes = "slower", no = 1))
}
Using lapply
You can also use lapply instead of a loop, like this:
# seq_len() is safe even if dat has zero columns, unlike 1:ncol(dat)
results <- lapply(seq_len(ncol(dat)), function(x) {
  ifelse(dat[, x] < threshold1[x],
         yes = "speeder",
         no = ifelse(dat[, x] > threshold2[x],
                     yes = "slower", no = 1))
})
names(results) <- colnames(dat)
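As a variation on the lapply approach, here is a minimal sketch that uses Map to walk over the columns and both threshold vectors in parallel, so no index is needed. It assumes the thresholds are stored in the same order as the columns. The helper name classify and the tiny fixed data set are made up for illustration and are not part of the original answer.

```r
# Small fixed example so the outcome is easy to check by hand (illustrative values)
dat <- data.frame(column1 = c(10, 50, 250),
                  column2 = c(5, 100, 300))
threshold1 <- c(16, 22)    # lower limits, one per column
threshold2 <- c(200, 275)  # upper limits, one per column

# Hypothetical helper: label one column given its lower and upper limit.
# As in the answer above, in-range values end up as the string "1".
classify <- function(x, lo, hi) {
  ifelse(x < lo, "speeder", ifelse(x > hi, "slower", 1))
}

# Map() iterates over the columns of dat and the two threshold vectors together;
# the resulting list takes its names from the columns of dat.
results_map <- Map(classify, dat, threshold1, threshold2)
```

Because Map takes its length from the inputs, the same call should work for a data set with any number of columns, as long as the threshold vectors have one entry per column.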
Result
You can access the results with results[[1]] to results[[6]], or with results$column1 to results$column6:
> head(results$column1, 100)
[1] "1" "1" "1" "1" "1" "1" "slower"
[8] "1" "slower" "1" "1" "1" "speeder" "1"
[15] "1" "1" "1" "1" "1" "1" "1"
[22] "slower" "1" "1" "1" "1" "1" "1"
[29] "1" "1" "1" "slower" "1" "slower" "slower"
[36] "1" "1" "1" "1" "speeder" "1" "1"
[43] "1" "1" "speeder" "speeder" "1" "1" "slower"
[50] "1" "1" "slower" "1" "1" "1" "1"
[57] "1" "1" "1" "1" "1" "1" "1"
[64] "1" "1" "1" "1" "slower" "1" "1"
[71] "slower" "1" "1" "1" "speeder" "1" "1"
[78] "1" "1" "1" "1" "slower" "1" "1"
[85] "1" "1" "1" "1" "1" "1" "1"
[92] "1" "1" "1" "1" "1" "1" "1"
[99] "speeder" "1"
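If a long vector of labels like the one above is hard to scan, one option is to count the labels per column. The sketch below assumes results is a named list of character vectors like the one built in this answer. The small list here is made-up stand-in data, and levels_used is a hypothetical name.

```r
# Stand-in for the results list built above (illustrative values only)
results <- list(column1 = c("1", "slower", "speeder", "1"),
                column2 = c("1", "1", "slower", "1"))

# Fix the set of labels so every column reports all three categories,
# including categories that occur zero times
levels_used <- c("speeder", "1", "slower")

# One column of counts per data column, one row per label
counts <- sapply(results, function(x) table(factor(x, levels = levels_used)))
```

Here counts is a matrix with the labels as row names and the column names of the data as column names.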
Answer 1: (score: 0)
You can also try lapply. It is more compact than an explicit loop, although any speed difference is usually small.
dat <- data.frame(
column1 = runif(n = 638, min=0, max=220),
column2 = runif(n = 638, min=0, max=300),
column3 = runif(n = 638, min=0, max=400),
column4 = runif(n = 638, min=0, max=100),
column5 = runif(n = 638, min=0, max=150),
column6 = runif(n = 638, min=0, max=200))
# define thresholds
threshold1 <- c(16, 22, 31, 6, 11, 13)
threshold2 <- c(200.0, 275.0, 387.5, 75.0, 137.5, 162.5)
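# Use ncol(dat) rather than a hard-coded 6 so this works for any number of columns
result <- matrix(unlist(lapply(seq_len(ncol(dat)), function(i) {
  ifelse(dat[, i] < threshold1[i],
         yes = "speeder",
         no = ifelse(dat[, i] > threshold2[i],
                     yes = "slower", no = 1))
})), ncol = ncol(dat), byrow = FALSE)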
head(result)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] "speeder" "1" "1" "slower" "1" "1"
[2,] "1" "1" "1" "1" "1" "1"
[3,] "1" "1" "1" "1" "1" "1"
[4,] "1" "1" "1" "slower" "1" "1"
[5,] "1" "1" "1" "1" "1" "1"
[6,] "1" "1" "1" "slower" "1" "1"
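The matrix above can also be built without the unlist and matrix steps. This is a minimal sketch, assuming every column has the same length (always true for a data frame) and that the thresholds are ordered like the columns. sapply then simplifies the per-column vectors into a character matrix on its own. The small fixed data set is made up for illustration.

```r
# Small fixed example (illustrative values, not the random data used above)
dat <- data.frame(column1 = c(10, 50, 250),
                  column2 = c(5, 100, 300))
threshold1 <- c(16, 22)
threshold2 <- c(200, 275)

# seq_along(dat) yields one index per column, however many columns there are
result <- sapply(seq_along(dat), function(i) {
  ifelse(dat[, i] < threshold1[i],
         yes = "speeder",
         no = ifelse(dat[, i] > threshold2[i],
                     yes = "slower", no = 1))
})

# sapply over integer indices does not carry the column names over, so set them
colnames(result) <- colnames(dat)
```

Keeping the column names makes it possible to index the result as result[, "column1"] rather than by position.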