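I have 5 vectors, and every item in each of them is either "yes" or "no". I want to compare these 5 vectors position by position (row-wise), compute the majority vote for each row, and store the result in a new vector. How can I do this efficiently?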
v1=c("yes","no","no","yes")
v2=c("no","no","yes","yes")
v3=c("yes","yes","no","yes")
v4=c("yes","no","yes","yes")
v5=c("yes","yes","yes","no")
#The expected output is "yes", "no", "yes", "yes"
Answer 0 (score: 3)
First, put the data in a character-based data frame:
dat <- data.frame( v1=c("yes","no","no","yes"),
v2=c("no","no","yes","yes"),
v3=c("yes","yes","no","yes"),
v4=c("yes","no","yes","yes"),
v5=c("yes","yes","yes","no"), stringsAsFactors=FALSE)
Then, for each row, build a table of the values and extract the name of the largest count:
apply(dat, 1, function(x) names(which.max(table(x))) )
[1] "yes" "no" "yes" "yes"
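To make the per-row step concrete, here is a small sketch of what the function passed to apply does for a single row. The names x, y, winner, and tie_winner are illustrative and not part of the original answer. It also shows a behavior worth knowing: which.max returns the first maximum, and table orders its entries by name, so an exact tie (possible only with an even number of voters) would resolve to "no".

```r
# One row of votes: the first element of each of v1..v5 from the question
x <- c("yes", "no", "yes", "yes", "yes")

counts <- table(x)                  # counts per distinct value, ordered by name
winner <- names(which.max(counts))  # name of the first largest count
print(winner)                       # "yes"

# Hypothetical tie with an even number of voters: the first entry wins
y <- c("yes", "no")
tie_winner <- names(which.max(table(y)))
print(tie_winner)                   # "no"
```

With five voters and two possible values a tie cannot occur, so this only matters if the number of vectors changes.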
Answer 1 (score: 0)
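Another approach is to use mapply with == to return a matrix of TRUE and FALSE values, comparing each vector's elements with one of the two possible values (here "yes"). rowMeans then computes the proportion of TRUE values in each row, and > 0.5 checks for a majority. Adding 1 converts the logical result to a numeric position, which is used to select the matching element of c("no", "yes").
c("no", "yes")[(rowMeans(mapply("==", MoreArgs=list("yes"), myList)) > 0.5) + 1L]
[1] "yes" "no" "yes" "yes"
An alternative using matrix multiplication is
c("no", "yes")[((do.call(cbind, myList) == "yes") %*%
rep(1, length(myList)) > (length(myList) / 2)) + 1L]
[1] "yes" "no" "yes" "yes"
Note that the vectors must first be placed in a list, here called myList.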
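Both approaches in this answer rely on a list called myList whose definition does not appear above. The following is a minimal sketch, assuming myList is simply the five vectors from the question collected with list(). The intermediate names by_means, yes_counts, and by_matmul are illustrative only.

```r
v1 <- c("yes", "no", "no", "yes")
v2 <- c("no", "no", "yes", "yes")
v3 <- c("yes", "yes", "no", "yes")
v4 <- c("yes", "no", "yes", "yes")
v5 <- c("yes", "yes", "yes", "no")

# Assumption: myList is just the five vectors in a plain list
myList <- list(v1, v2, v3, v4, v5)

# Approach 1: proportion of "yes" votes at each position
by_means <- c("no", "yes")[(rowMeans(mapply("==", myList, MoreArgs = list("yes"))) > 0.5) + 1L]

# Approach 2: count of "yes" votes at each position via matrix multiplication
yes_counts <- (do.call(cbind, myList) == "yes") %*% rep(1, length(myList))
by_matmul <- c("no", "yes")[(yes_counts > length(myList) / 2) + 1L]

print(by_means)                   # "yes" "no" "yes" "yes"
print(all(by_means == by_matmul)) # TRUE
```

Splitting the matrix-multiplication version into two steps makes it easier to see that yes_counts holds the number of "yes" votes per position before it is compared with half the number of vectors.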