Concordance among three variables in R, and plotting the data

Time: 2016-01-23 13:37:09

Tags: r

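I have a data frame called mydf. It has three groups of columns, labelled app, ora, and pin. I want to compare the values across all columns for app vs ora, ora vs pin, and pin vs app, and get concordance/match statistics. I would also like the overall concordance among the three variables, and a plot representing the data. What is the best way to do this in R?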

mydf <- structure(c("0/0", "0/1", "0/0", "0/0", "0/0", "0/0", "0/0",
                    "0/0", "0/1", "0/0", "0/1", "0/0", "0/0", "0/0", "0/0", "0/0",
                    "0/0", "0/1"), .Dim = c(3L, 6L),
                  .Dimnames = list(c("1", "2", "4"),
                                   c("app:x", "ora:x", "pin:x", "app:y", "ora:y", "pin:y")))

1 answer:

Answer 0 (score: 2)

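Well, here is one approach as a starting point (there is probably plenty of room for optimization, since I'm not familiar with the data.table package):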

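library(data.table)       # attached explicitly for :=, melt() and setkey()
library(splitstackshape)  # provides cSplit()
# Split genotypes at "/", melt to long format, then split names at ":" into source (variable_1) and marker_allele (variable_2)
dt <- cSplit(melt(cSplit(mydf, 1:6, "/")[, rowname:=rownames(mydf)], id.vars = c("rowname")), 2, ":")[]
# Self-join on row and marker_allele so every source is paired with every other source
setkey(dt, rowname, variable_2)
dt <- dt[dt, allow.cartesian=TRUE][variable_1!=i.variable_1]
# Keep each unordered pair of sources only once
idx <- which(!duplicated(cbind(dt$rowname,dt$variable_2, t(apply(dt[, .(variable_1, i.variable_1)], 1, function(x) sort(x))))))
# Flag whether the two sources report the same allele
dt <- dt[idx, .(rowname, variable_2, variable_1, i.variable_1, isEqual=value==i.value)]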
dt
#     rowname variable_2 variable_1 i.variable_1 isEqual
#  1:       1        x_1        ora          app    TRUE
#  2:       1        x_1        pin          app    TRUE
#  3:       1        x_1        pin          ora    TRUE
#  4:       1        x_2        ora          app    TRUE
#  5:       1        x_2        pin          app    TRUE
#  6:       1        x_2        pin          ora    TRUE
#  7:       1        y_1        ora          app    TRUE
#  8:       1        y_1        pin          app    TRUE
#  9:       1        y_1        pin          ora    TRUE
# 10:       1        y_2        ora          app    TRUE
# 11:       1        y_2        pin          app    TRUE
# 12:       1        y_2        pin          ora    TRUE
# 13:       2        x_1        ora          app    TRUE
# 14:       2        x_1        pin          app    TRUE
# 15:       2        x_1        pin          ora    TRUE
# 16:       2        x_2        ora          app   FALSE
# 17:       2        x_2        pin          app   FALSE
# ...

library(ggplot2)
ggplot(dt, aes(variable_1, i.variable_1, fill=isEqual)) +
  geom_tile() + 
  facet_grid(rowname~variable_2)
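
The tile plot shows agreement cell by cell, but the question also asks for summary statistics. As a rough complement, here is a minimal base-R sketch that computes the pairwise and overall concordance rates. It is only one possible reading of "concordance": it assumes the whole genotype string (e.g. "0/1") is the unit of comparison rather than each allele separately, and the names `src`, `marker`, `calls`, `pairwise`, and `overall` are made up for this illustration.

```r
# Same matrix as in the question; columns are named "<source>:<marker>".
mydf <- structure(c("0/0", "0/1", "0/0", "0/0", "0/0", "0/0", "0/0",
                    "0/0", "0/1", "0/0", "0/1", "0/0", "0/0", "0/0", "0/0", "0/0",
                    "0/0", "0/1"), .Dim = c(3L, 6L),
                  .Dimnames = list(c("1", "2", "4"),
                                   c("app:x", "ora:x", "pin:x", "app:y", "ora:y", "pin:y")))

src    <- sub(":.*$", "", colnames(mydf))  # source part of each column name
marker <- sub("^.*:", "", colnames(mydf))  # marker part of each column name

# One column per source; genotypes stacked over markers in a fixed marker order,
# so that row i refers to the same (row, marker) cell in every column.
calls <- sapply(unique(src), function(s) {
  cols <- which(src == s)
  as.vector(mydf[, cols[order(marker[cols])], drop = FALSE])
})

# Pairwise concordance: share of cells where two sources give the same genotype.
pairs    <- combn(colnames(calls), 2)
pairwise <- apply(pairs, 2, function(p) mean(calls[, p[1]] == calls[, p[2]]))
names(pairwise) <- apply(pairs, 2, paste, collapse = "_vs_")

# Overall concordance: share of cells where all sources give the same genotype.
overall <- mean(apply(calls, 1, function(r) length(unique(r)) == 1))

# Simple bar chart of the rates (base graphics, no extra packages).
barplot(c(pairwise, overall = overall), ylim = c(0, 1),
        ylab = "Proportion of concordant genotypes")
```

On the example data, app vs ora and ora vs pin each agree in 4 of 6 cells, while app vs pin and the three-way comparison agree in 2 of 6. If per-allele agreement is wanted instead, `mean(isEqual)` grouped by `variable_1` and `i.variable_1` on the `dt` built above would be the analogous summary.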
