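As part of an attribute agreement analysis, I need to find out in how many cases the operators (x, y, z) fully agree with each other. Suppose my dataset looks like this.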
library(data.table)
DT <- data.table(x = c("Good","Average","Poor","Poor"), y = c("Good","Average","Poor","Average"), z = c("Average","Average","Poor","Good"))
> DT
         x       y       z
1:    Good    Good Average
2: Average Average Average
3:    Poor    Poor    Poor
4:    Poor Average    Good
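For each row, I want to know whether the strings in columns x, y and z are equal, and to store the result in a new column. If all columns are equal, it should return one. If one or more columns hold a different value, it should return zero.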
         x       y       z all.equal
1:    Good    Good Average         0
2: Average Average Average         1
3:    Poor    Poor    Poor         1
4:    Poor Average    Good         0
I have managed to check whether two columns are equal:
vgrepl <- Vectorize(grepl)
DT[, all.equal:= as.integer(vgrepl(x, y))]
But I cannot get it to work for more than two columns.
Thank you very much!
Answer 0 (score: 0)
This approach checks whether each row contains exactly one unique value or more than one:
library(data.table)
DT <- data.table(x = c("Good","Average","Bad"),
                 y = c("Good","Average","Bad"),
                 z = c("Average","Average","Bad"))
DT[, all.equal:= as.numeric(length(unique(c(x,y,z))) == 1), by=seq_len(nrow(DT))]
DT
#          x       y       z all.equal
# 1:    Good    Good Average         0
# 2: Average Average Average         1
# 3:     Bad     Bad     Bad         1
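The same row-wise idea can be sketched with data.table's `uniqueN()` and `.SDcols`, so the column names are not hard-coded in the expression. This is only an illustration: the four-row table is assumed from the printed data in the question, and `cols` is a name introduced here. Note that `uniqueN()` counts `NA` as a distinct value by default.

```r
library(data.table)

# Assumed sample data, taken from the printed table in the question.
DT <- data.table(x = c("Good", "Average", "Poor", "Poor"),
                 y = c("Good", "Average", "Poor", "Average"),
                 z = c("Average", "Average", "Poor", "Good"))

cols <- c("x", "y", "z")  # columns to compare (name chosen for this sketch)

# One group per row; uniqueN() counts the distinct values found in that row.
DT[, all.equal := as.integer(uniqueN(unlist(.SD)) == 1L),
   by = seq_len(nrow(DT)), .SDcols = cols]

print(DT)
```

Grouping by row is easy to read but can be slow on large tables, since the expression is evaluated once per row.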
Answer 1 (score: 0)
cols <- c("x", "y", "z")
all_same <- function(x) as.integer(all(x[1] == x[-1]))
DT[, all.equal := apply(.SD, 1, all_same), .SDcols = cols]
#          x       y       z all.equal
# 1:    Good    Good Average         0
# 2: Average Average Average         1
# 3:     Bad     Bad     Bad         1
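For larger tables, a column-wise comparison may avoid the per-row overhead of `apply()`. The following is a minimal sketch, not part of the original answer: it compares every remaining column with the first one and combines the results with `Reduce()`. The four-row table is assumed from the question's printed data, and `first` and `same` are names introduced here. If any value is `NA`, the result for that row may be `NA` rather than 0.

```r
library(data.table)

# Assumed sample data, taken from the printed table in the question.
DT <- data.table(x = c("Good", "Average", "Poor", "Poor"),
                 y = c("Good", "Average", "Poor", "Average"),
                 z = c("Average", "Average", "Poor", "Good"))

cols <- c("x", "y", "z")

# Compare each remaining column with the first one (vectorised over rows),
# then AND the resulting logical vectors together.
first <- DT[[cols[1]]]
same  <- Reduce(`&`, lapply(cols[-1], function(nm) DT[[nm]] == first))

DT[, all.equal := as.integer(same)]

print(DT)
```

Unlike `grepl()`, which tests whether a pattern occurs inside a string, `==` tests exact equality, which is what agreement between raters requires here.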