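I want to apply an IF statement to multiple columns (in effect, the whole data frame), and I am approaching this by writing a function. The goal is to replace the data in each column with a number representing the group that the value falls into.
A sample of the data looks like this: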
> Mat
A B C D E
E1 8.45 6.65 7.35 5.18 3.11
E2 12.59 4.18 4.08 0.95 1.75
E3 15.93 3.05 1.81 2.77 4.42
E4 15.93 3.05 1.81 2.77 4.42
E5 11.57 4.48 4.70 2.01 1.08
E6 8.17 7.05 7.70 5.38 3.45
E7 11.57 4.48 4.70 2.01 1.08
E8 9.49 5.41 6.51 5.78 3.20
E9 11.71 4.40 4.58 1.87 1.11
E10 9.52 5.49 6.63 6.07 3.49
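The function I am trying to create would take the IF statement, look at each value in a column, and replace it with a group number from 1 to 6 (for numbers between 1 and 10) or NA for numbers greater than 10. The IF statement itself works when I write it out by hand for a single column. The function I wrote (called Grouping) is:
# write user function to apply the loop
Grouping = function(data) {
  for(i in 1:length(x)) {
    if(x[i] < 1) {
      x[i] = 1
    } else if (x[i] < 3) {
      x[i] = 3
    } else if (x[i] < 4) {
      x[i] = 4
    } else if (x[i] < 5) {
      x[i] = 5
    } else if (x[i] < 10) {
      x[i] = 6
    } else
      x[i] = "NA"
  }
}
When I try to use it with the apply function, I get an error.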
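Clearly the problem lies in how I built the user-defined function, but I am not sure where I went wrong, as I am very new to writing functions.
Any help is appreciated. Thanks!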
Answer 0 (score: 2)
When working with vectors, you should use ifelse rather than a loop.
grouping <- function(x)
{
  ifelse(x < 1, 1,
  ifelse(x < 3, 3,
  ifelse(x < 4, 4,
  ifelse(x < 5, 5,
  ifelse(x < 10, 6,
         NA)))))
}
data[] <- lapply(data, grouping)
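As a quick check of the nested ifelse approach, here is a minimal sketch that applies it to the first three rows and columns of the question's sample data. The data frame name `df` is an assumption made for illustration; the values are copied from the sample.

```r
# Nested ifelse version of the grouping function from this answer
grouping <- function(x) {
  ifelse(x < 1, 1,
  ifelse(x < 3, 3,
  ifelse(x < 4, 4,
  ifelse(x < 5, 5,
  ifelse(x < 10, 6, NA)))))
}

# Rows E1 to E3, columns A to C of the sample data (df is a hypothetical name)
df <- data.frame(
  A = c(8.45, 12.59, 15.93),
  B = c(6.65, 4.18, 3.05),
  C = c(7.35, 4.08, 1.81),
  row.names = c("E1", "E2", "E3")
)

# Assigning into df[] replaces every column but keeps the data frame structure
df[] <- lapply(df, grouping)
print(df)
```

Because `ifelse` is vectorised, each column is processed in a single call, and values of 10 or more become a real `NA` rather than the string "NA".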
Or, better still, use cut to convert the numeric vector into bands:
grouping <- function(x)
{
  x <- cut(x, c(-Inf, 1, 3, 4, 5, 10), labels=c(1, 3, 4, 5, 6), right=FALSE)
  as.numeric(as.character(x))
}
data[] <- lapply(data, grouping)
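To see how `right=FALSE` treats values that sit exactly on a break, the following sketch runs the cut-based function on a few boundary values. The test vector `vals` is made up for illustration and is not part of the question's data.

```r
# cut-based grouping function from this answer
grouping <- function(x) {
  x <- cut(x, c(-Inf, 1, 3, 4, 5, 10), labels = c(1, 3, 4, 5, 6), right = FALSE)
  as.numeric(as.character(x))
}

# Hypothetical values chosen to land on and around each break
vals <- c(0.5, 1, 2.99, 3, 4, 5, 9.99, 10, 12)

# Intervals are closed on the left: [1, 3) maps to 3, [3, 4) maps to 4, and so on.
# Anything at or above the last break (10) falls outside every interval and becomes NA.
res <- grouping(vals)
print(res)
```

With left-closed intervals the bands agree with the chained `x < ...` tests in the ifelse version, so the two functions should be interchangeable.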
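Answer 1 (score: 1)
Here is one approach: change the function's argument from data to x and drop the loop, so that the function handles a single value: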
Grouping = function(x) {
  if(x < 1) {
    x = 1
  } else if (x < 3) {
    x = 3
  } else if (x < 4) {
    x = 4
  } else if (x < 5) {
    x = 5
  } else if (x < 10) {
    x = 6
  } else
    x = "NA"
}
Dummy data
> set.seed(1)
> mat<-matrix(rnorm(100,5,5), nrow=10)
> mat
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1.8677309 12.558906 9.594887 11.793398 4.177382 6.9905294 17.008089 7.3775476 2.156656 2.287400
[2,] 5.9182166 6.949216 8.910682 4.486061 3.733192 1.9398680 4.803800 1.4502678 4.324107 11.039339
[3,] 0.8218569 1.893797 5.372825 6.938358 8.484817 6.7055985 8.448697 8.0536318 10.890435 10.802013
[4,] 12.9764040 -6.073499 -4.946758 4.730975 7.783316 -0.6468155 5.140011 0.3295118 -2.617834 8.501068
[5,] 6.6475389 10.624655 8.099129 -1.885298 1.556222 12.1651185 1.283634 -1.2681670 7.969731 12.934167
[6,] 0.8976581 4.775332 4.719356 2.925027 1.462524 14.9019995 5.943961 6.4572312 6.664752 7.792432
[7,] 7.4371453 4.919049 4.221022 3.028550 6.822910 3.1638926 -4.024793 2.7835406 10.315499 -1.382961
[8,] 8.6916235 9.719181 -2.353762 4.703433 8.842665 -0.2206731 12.327774 5.0055268 3.479080 2.133673
[9,] 7.8789068 9.106106 2.609250 10.500127 4.438269 7.8485981 5.766267 5.3717066 6.850094 -1.123063
[10,] 3.4730581 7.969507 7.089708 8.815879 9.405539 4.3247270 15.863058 2.0523953 6.335494 2.632997
Applying the function
> matrix(lapply(mat, Grouping), nrow = 10)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 3 "NA" 6 "NA" 5 6 "NA" 6 3 3
[2,] 6 6 6 5 4 3 5 3 5 "NA"
[3,] 1 3 6 6 6 6 6 6 "NA" "NA"
[4,] "NA" 1 1 5 6 1 6 1 1 6
[5,] 6 "NA" 6 1 3 "NA" 3 1 6 "NA"
[6,] 1 5 5 3 3 "NA" 6 6 6 6
[7,] 6 5 5 4 6 4 1 3 "NA" 1
[8,] 6 6 1 5 6 1 "NA" 6 4 3
[9,] 6 6 3 "NA" 5 6 6 6 6 1
[10,] 4 6 6 6 6 5 "NA" 3 6 3
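Note that the result above is a list-matrix in which missing groups are the string "NA" rather than a real NA. If a plain numeric matrix is preferred, one possible alternative is sketched below. It assumes the cut-based `grouping` function from the other answer and uses a hypothetical name, `grouped`, for the result.

```r
# Vectorised grouping function (cut-based, taken from the other answer)
grouping <- function(x) {
  x <- cut(x, c(-Inf, 1, 3, 4, 5, 10), labels = c(1, 3, 4, 5, 6), right = FALSE)
  as.numeric(as.character(x))
}

# Same dummy data as above
set.seed(1)
mat <- matrix(rnorm(100, 5, 5), nrow = 10)

# cut() drops the dimensions, so copy the matrix and assign into grouped[]
# to keep the 10 x 10 shape (grouped is a hypothetical name)
grouped <- mat
grouped[] <- grouping(mat)
print(grouped)
```

Here every cell is numeric, and values of 10 or more are real `NA`s that work with functions such as `is.na()`.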