I have some character columns in a data frame df:
V1 V2 V3 group
B C - 1
B C C 1
B C C 1
A C A 2
A A A 2
A A A 2
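For each column, I want to know whether the intersection of the values across the factor groups is empty, and I would like the result in TRUE/FALSE format.
Column 2 is the only column with a non-empty intersection, which I checked with: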
> is.na(intersect(df[,2][df$group=="1"],df[,2][df$group=="2"]))
[1] FALSE
I tried to automate this for the three columns V1-V3 using
by(df[,1:3], df$group, function(x) { is.na(intersect(x[df$group=="1"],x[df$group=="2"]))})
but got the error:
Error in `[.data.frame`(x, df$group == "2") : undefined columns selected
Thanks for any suggestions/alternatives!
Answer 0 (score: 1)
Try
lapply(df[,1:3], function(x)
is.na(intersect(x[df$group=='1'], x[df$group=='2'])))
Or
Map(function(x,y) is.na(intersect(x,y)),
df[df$group=='1',-4], df[df$group=='2', -4])
If you have many groups,
lapply(df[,1:3], function(x) is.na(Reduce(`intersect`,split(x, df$group))))
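One caveat about the calls above: is.na() applied to an empty intersection returns logical(0) rather than TRUE, so columns with no common value do not come back as a plain TRUE/FALSE. A minimal sketch of a variant that tests the length of the intersection instead is shown here; the helper name empty_intersection is hypothetical and not part of the original answer, and the data frame is rebuilt so the snippet runs on its own.

```r
# Sample data with the same values as in the question.
df <- data.frame(
  V1 = c("B", "B", "B", "A", "A", "A"),
  V2 = c("C", "C", "C", "C", "A", "A"),
  V3 = c("-", "C", "C", "A", "A", "A"),
  group = c(1L, 1L, 1L, 2L, 2L, 2L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when the values of x have no element
# common to every group in g, FALSE otherwise.
empty_intersection <- function(x, g) {
  length(Reduce(intersect, split(x, g))) == 0
}

# One named logical value per column V1..V3; works for any number of groups.
res <- vapply(df[, 1:3], empty_intersection, logical(1), g = df$group)
res
```

With this data, only V2 has a value shared by both groups, so it is the only column for which the result is FALSE.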
df <- structure(list(V1 = c("B", "B", "B", "A", "A", "A"), V2 = c("C",
"C", "C", "C", "A", "A"), V3 = c("-", "C", "C", "A", "A", "A"
), group = c(1L, 1L, 1L, 2L, 2L, 2L)), .Names = c("V1", "V2",
"V3", "group"), class = "data.frame", row.names = c(NA, -6L))
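As for the error in the question: by() hands the function each group's rows as a data frame, so inside the function x[df$group=="2"] is treated as a column selection. The six-element logical index then points at columns 4 to 6 of a three-column data frame, which do not exist. The sketch below reproduces this under the same sample data (rebuilt here so it is self-contained) and contrasts it with indexing a single column vector, which behaves as intended.

```r
# Sample data with the same values as in the question.
df <- data.frame(
  V1 = c("B", "B", "B", "A", "A", "A"),
  V2 = c("C", "C", "C", "C", "A", "A"),
  V3 = c("-", "C", "C", "A", "A", "A"),
  group = c(1L, 1L, 1L, 2L, 2L, 2L),
  stringsAsFactors = FALSE
)

# Inside by(), x is a 3-row, 3-column data frame. A single logical index
# selects columns, and its TRUE positions (4 to 6) are out of range.
msg <- tryCatch(
  {
    by(df[, 1:3], df$group, function(x) x[df$group == "2"])
    NA_character_  # only reached if no error is raised
  },
  error = function(e) conditionMessage(e)
)
msg

# Indexing one column vector with the full-length logical selects rows' values.
v2_common <- intersect(df$V2[df$group == 1], df$V2[df$group == 2])
v2_common
```

This is why the lapply() and Map() forms work: they pass one column vector at a time, so the row-length logical index lines up with the vector being subset.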