D = c("d", "d", "S", "d")
A = c("d", "a", "a", "x")
X = c("v", "x", "x", "t")
R = c("t", "r", "r", "r")
dat = data.frame(D, A, X, R)
D A X R MajoritySum
d d v t 1
d a x r 4
S a x r 3
d x t r 2
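I am trying to add the MajoritySum column shown in the table above. For each row, it counts how many of the row's values equal the majority (most frequent) level of their column.
I looped over the data frame to get the majority class of each column, but I am stuck on the next step.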
majority = rep(NA, 4)
for (i in 1:4) {
  majority[i] = names(sort(table(dat[, i]), decreasing = TRUE)[1])
}
> majority
[1] "d" "a" "x" "r"
Answer 0 (score: 1)
Here is a base R solution:
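# Uses dat and majority as defined in the question
cols <- c("D", "A", "X", "R")  # compare only the original columns
for (i in 1:nrow(dat)) {
  dat$MajoritySum[i] <- sum(dat[i, cols] == majority)
}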
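A loop-free variant of the same idea can be sketched as follows. It assumes the example data from the question; the names `matches` and `majority_sum` are illustrative and do not appear in the original post. Note that `which.max()` keeps the first level when two values are tied for most frequent, which may or may not be the tie-breaking you want.

```r
# Example data from the question
dat <- data.frame(
  D = c("d", "d", "S", "d"),
  A = c("d", "a", "a", "x"),
  X = c("v", "x", "x", "t"),
  R = c("t", "r", "r", "r")
)

# Most frequent value per column (first level wins on ties)
majority <- vapply(dat, function(col) names(which.max(table(col))), character(1))

# Compare each column with its own majority value: one logical column per variable
matches <- mapply(function(col, m) col == m, dat, majority)

# Count the matches in each row and attach the result
majority_sum <- rowSums(matches)
dat$MajoritySum <- majority_sum
```

For the example data this should give a MajoritySum of 1, 4, 3, 2, matching the table in the question.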
Answer 1 (score: 0)
Here is a naive solution using dplyr:
library(dplyr)
# alternatively you can use if_else(D == majority[1], 1, 0) and so on
dat %>%
  mutate(
    MajoritySum = if_else(D == "d", 1, 0) +
      if_else(A == "a", 1, 0) +
      if_else(X == "x", 1, 0) +
      if_else(R == "r", 1, 0)
  )
#   D A X R MajoritySum
# 1 d d v t           1
# 2 d a x r           4
# 3 S a x r           3
# 4 d x t r           2
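If hard-coding the majority values is undesirable, one possible generalization is sketched below. It assumes dplyr 1.0.0 or later (for `across()` and `cur_column()`) and a named `majority` vector. The names are an assumption added here, since the vector built in the question is unnamed. The result name `res` is also illustrative.

```r
library(dplyr)

# Example data from the question
dat <- data.frame(
  D = c("d", "d", "S", "d"),
  A = c("d", "a", "a", "x"),
  X = c("v", "x", "x", "t"),
  R = c("t", "r", "r", "r")
)

# Majority value per column, named by column (assumption: names added for lookup)
majority <- c(D = "d", A = "a", X = "x", R = "r")

res <- dat %>%
  mutate(
    # For each listed column, test equality with that column's majority value,
    # then count the TRUE values across the row
    MajoritySum = rowSums(
      across(all_of(names(majority)), ~ .x == majority[cur_column()])
    )
  )
```

Looking up `majority[cur_column()]` by name keeps the comparison tied to the right column even if the columns are reordered.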