R: extract the rows that differ between two matrices

Asked: 2015-02-02 14:07:14

Tags: r matrix comparison extraction

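I have two matrices, m1 and m2, with the same structure, and I need to create two new matrices:
1) the first (called Partenza) with all the rows of m1 that differ from m2; 2) the second (called Arrivo) with the corresponding rows from m2.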

m1 
    row_num  datoA        datoB
    1          p           f
    3          h           b
    5          c           m
    6          c           r
    9          m           f
    14         a           b

m2 
    row_num  datoA        datoB
    1          p           f
    3          h           b
    5          c           g
    6          a           r
    9          m           f
    14         x           j

My result should be:

Partenza (taken from m1)
    row_num  datoA        datoB
    5          c           m
    6          c           r
    14         a           b

Arrivo (taken from m2)
    row_num  datoA        datoB
    5          c           g
    6          a           r
    14         x           j

I tried:

zzz <- setdiff(m1,m2)  
partenza<-m1[m1[,"ROW_NUM"] %in% zzz,]
arrivo<-  m1[m1[,"ROW_NUM"] %in% zzz,]  

but it doesn't work: zzz is always empty (and I believe it shouldn't be!)

2 answers:

Answer 0 (score: 1)

Use the anti_join function from the dplyr package:

library(dplyr)

m1 <- data.frame(
  row_num = c(1,3,5,6,9,14),
  datoA = c("p","h","c","c","m","a"),
  datoB = c("f","b","m","r","f","b")
  )

m2 <- data.frame(
  row_num = c(1,3,5,6,9,14),
  datoA = c("p","h","c","a","m","x"),
  datoB = c("f","b","g","r","f","j")
)

Partenza <- anti_join(m1,m2) %>% arrange(row_num)
Arrivo <- anti_join(m2,m1) %>% arrange(row_num)
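When `by` is omitted, anti_join matches on every column the two data frames share and prints a message saying which ones it picked. A minimal sketch of the same call with the columns spelled out, assuming all three columns should take part in the comparison (`key_cols` is just an illustrative name, and `stringsAsFactors = FALSE` is added so the strings stay character on older R versions):

```r
library(dplyr)

m1 <- data.frame(
  row_num = c(1, 3, 5, 6, 9, 14),
  datoA = c("p", "h", "c", "c", "m", "a"),
  datoB = c("f", "b", "m", "r", "f", "b"),
  stringsAsFactors = FALSE
)
m2 <- data.frame(
  row_num = c(1, 3, 5, 6, 9, 14),
  datoA = c("p", "h", "c", "a", "m", "x"),
  datoB = c("f", "b", "g", "r", "f", "j"),
  stringsAsFactors = FALSE
)

# Hypothetical helper name: the columns that define "the same row".
key_cols <- c("row_num", "datoA", "datoB")

# Rows of m1 with no exact match in m2, and the other way round.
Partenza <- anti_join(m1, m2, by = key_cols) %>% arrange(row_num)
Arrivo   <- anti_join(m2, m1, by = key_cols) %>% arrange(row_num)
```

Listing the columns explicitly makes it obvious that a row counts as "different" as soon as any one of them changes.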

Answer 1 (score: 1)

In base R (no packages needed), you can do the following:

# TRUE for every row in which at least one cell differs between m1 and m2
diffrow <- sapply(seq_len(nrow(m1)), function(x) !all(m1[x, ] == m2[x, ]))
partenza <- m1[diffrow, ]
arrivo <- m2[diffrow, ]
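The same row-by-row test can probably be written without the explicit loop. This is only a sketch, and it assumes that m1 and m2 have identical dimensions, that their rows are already lined up in the same order, and that there are no NA values:

```r
m1 <- data.frame(
  row_num = c(1L, 3L, 5L, 6L, 9L, 14L),
  datoA = c("p", "h", "c", "c", "m", "a"),
  datoB = c("f", "b", "m", "r", "f", "b"),
  stringsAsFactors = FALSE
)
m2 <- data.frame(
  row_num = c(1L, 3L, 5L, 6L, 9L, 14L),
  datoA = c("p", "h", "c", "a", "m", "x"),
  datoB = c("f", "b", "g", "r", "f", "j"),
  stringsAsFactors = FALSE
)

# m1 != m2 compares cell by cell and yields a logical matrix;
# a row differs when it contains at least one TRUE.
diffrow <- rowSums(m1 != m2) > 0

partenza <- m1[diffrow, ]
arrivo   <- m2[diffrow, ]
```

Like the sapply version, this compares position by position, so it will not work if the two tables list their rows in a different order.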

Another option, using the setdiff function from the dplyr package:

library(dplyr)
partenza<-setdiff(m1,m2)
arrivo<-setdiff(m2,m1)

In both cases, you get:

> partenza
#  row_num datoA datoB
#3       5     c     m
#4       6     c     r
#6      14     a     b
> arrivo
#  row_num datoA datoB
#3       5     c     g
#4       6     a     r
#6      14     x     j

Data:

m1<-structure(list(row_num = c(1L, 3L, 5L, 6L, 9L, 14L), datoA = c("p", 
"h", "c", "c", "m", "a"), datoB = c("f", "b", "m", "r", "f", 
"b")), .Names = c("row_num", "datoA", "datoB"), class = "data.frame", row.names = c(NA, 
-6L))
m2<-structure(list(row_num = c(1L, 3L, 5L, 6L, 9L, 14L), datoA = c("p", 
"h", "c", "a", "m", "x"), datoB = c("f", "b", "g", "r", "f", 
"j")), .Names = c("row_num", "datoA", "datoB"), class = "data.frame", row.names = c(NA, 
-6L))
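As for why zzz came out empty in the question: a likely explanation, assuming m1 and m2 really are character matrices rather than data frames, is that base R's setdiff treats its arguments as plain vectors and so compares individual cells, not rows. Every cell value in m1 also occurs somewhere in m2, so nothing is left over. The sketch below reproduces that and shows one possible row-wise comparison for matrices; the `key1` and `key2` names and the separator are illustrative choices, not part of the original posts.

```r
# Character matrices mirroring the question's tables (assumed layout).
m1 <- cbind(row_num = c("1", "3", "5", "6", "9", "14"),
            datoA   = c("p", "h", "c", "c", "m", "a"),
            datoB   = c("f", "b", "m", "r", "f", "b"))
m2 <- cbind(row_num = c("1", "3", "5", "6", "9", "14"),
            datoA   = c("p", "h", "c", "a", "m", "x"),
            datoB   = c("f", "b", "g", "r", "f", "j"))

# base::setdiff works on the flattened cell values, so the row structure is lost.
zzz <- base::setdiff(m1, m2)
length(zzz)

# Build one string per row and compare those instead.
# The separator is assumed not to occur inside any cell.
key1 <- apply(m1, 1, paste, collapse = "\r")
key2 <- apply(m2, 1, paste, collapse = "\r")

partenza <- m1[!(key1 %in% key2), , drop = FALSE]
arrivo   <- m2[!(key2 %in% key1), , drop = FALSE]
```

With data frames instead of matrices, the dplyr setdiff shown above already does the row-wise comparison.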