Return the IDs of a matrix that contain all elements of a vector

Time: 2017-12-20 12:49:44

Tags: r

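I know the answer is simple, but so far I have not been able to figure it out. I also tried to find an answer in similar questions, without success. In short, I need to return the IDs of matrix m whose LO values include all elements of the vector NoN. In the example I prepared below, I need to return IDs 1 and 3.

Example: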

m<-matrix(c(1,1,1,1,2,2,34,45,4,4,4,4,4,5,6,3,3,3,3,21,22,3425,345,65,22,42,65,86,456,454,5678,5,234,22,65,21,22,786),nrow=19)
colnames(m)<-c("ID","LO")
NoN<-c(21,22)

My attempts so far:

1: m[all(m[,2] %in% NoN),1]
2: m[match(NoN, m[,2]),1]
3: subset(m, m[,2] %in% NoN)
4: m[which(m[,2] %in% NoN),1]

Thanks!

4 answers:

Answer 0 (score: 3)

Here is a function using base R:

FOO <- function(m, NoN){
  # split matrix based on ID column
  m2 <- lapply(split(m, m[, 1]), function(x) matrix(x, ncol = 2)) 
  # match every element of NoN within each ID; NA marks an element that is missing
  matchresult <- do.call(cbind, lapply(lapply(m2, function(x) lapply(NoN, function(y) match(y, x[,2]))), unlist))
  # return colnames (= ID) of columns with no NA
  as.numeric(colnames(matchresult)[colSums(apply(matchresult, 2, is.na)) == 0])
}

Result of the function call:

> FOO(m, NoN)
[1] 1 3

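Untested beyond your example, but this should be able to handle NoN of arbitrary length as well as repeated ID-LO combinations.

Edit: a more concise and efficient variant provided by @docendodiscimus: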

FOO <- function(m, NoN){
  df <- as.data.frame(m) 
  unique(df[as.logical(ave(df$LO, df$ID, FUN = function(x) all(NoN %in% x))),"ID"])
}
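The `ave()` call is the step that does the work in this variant, so a minimal sketch of what it produces may help. It assumes `m` and `NoN` as defined in the question; the names `flag`, `has_all` and `ids` are introduced here for illustration only. Because `ave()` writes its result back into a copy of `df$LO`, the logical answer comes back as 0/1 repeated on every row of the ID, which is why the variant needs `as.logical()` and `unique()`. The `tapply()` formulation is shown as an alternative, not as part of the original answer; it yields one logical per ID instead.

```r
m <- matrix(c(1,1,1,1,2,2,34,45,4,4,4,4,4,5,6,3,3,3,3,
              21,22,3425,345,65,22,42,65,86,456,454,5678,5,234,22,65,21,22,786),
            nrow = 19)
colnames(m) <- c("ID", "LO")
NoN <- c(21, 22)
df <- as.data.frame(m)

# One value per row: 1 if the row's ID contains every element of NoN, else 0
flag <- ave(df$LO, df$ID, FUN = function(x) all(NoN %in% x))

# Alternative (illustrative): one logical per ID, named by ID
has_all <- tapply(df$LO, df$ID, function(x) all(NoN %in% x))
ids <- as.numeric(names(has_all)[has_all])
```

Here `flag` has one entry per row of `m`, while `has_all` has one entry per distinct ID, so no de-duplication is needed before extracting `ids`.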

Answer 1 (score: 2)

A less safe way using base R:

m<-matrix(c(1,1,1,1,2,2,34,45,4,4,4,4,4,5,6,3,3,3,3,21,22,3425,345,65,22,42,65,86,456,454,5678,5,234,22,65,21,22,786),nrow=19)
colnames(m)<-c("ID","LO")
NoN<-c(21,22)

IDs <- m[m[, 2] %in% NoN, 1]
IDs <- table(IDs)

IDs <- names(IDs)[IDs >= length(NoN)]

> IDs
[1] "1" "3"

But beware: this does not account for duplicated values. So if ID 1 has two LO values of 21 but no 22, it will still return ID 1.

Edit: a safe way using dplyr:
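The pitfall with duplicated values can be reproduced on a small data set. The sketch below uses hypothetical data (`m_dup`) in which ID 1 has LO = 21 twice and never 22; the names `naive`, `m_uniq` and `fixed` are illustrative. It also shows one possible base R remedy, dropping duplicated (ID, LO) rows before counting, which still assumes that NoN itself contains no duplicates.

```r
# Hypothetical data: ID 1 has LO = 21 twice and never 22; ID 3 has both
m_dup <- cbind(ID = c(1, 1, 3, 3), LO = c(21, 21, 21, 22))
NoN <- c(21, 22)

# Count-based approach from above: ID 1 is reported although it lacks 22
hits <- table(m_dup[m_dup[, "LO"] %in% NoN, "ID"])
naive <- names(hits)[hits >= length(NoN)]

# Dropping duplicated (ID, LO) rows first makes the count meaningful
m_uniq <- unique(m_dup[m_dup[, "LO"] %in% NoN, , drop = FALSE])
hits_uniq <- table(m_uniq[, "ID"])
fixed <- names(hits_uniq)[hits_uniq >= length(NoN)]
```

With this data `naive` contains both "1" and "3", whereas `fixed` contains only "3".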

library(dplyr)

m <- data.frame(m)

IDs <- m %>% 
  slice(which(LO %in% NoN)) %>%             # get all rows which contain values from NoN
  group_by(ID) %>%                          # group by ID
  summarise(uniques = n_distinct(LO)) %>%   # count unique values per ID
  filter(uniques == length(NoN)) %>%        # number of unique values has to be the same as the number of values in NoN
  select(ID) %>%                            # select ID columns
  unlist() %>%                              # unlist it
  as.numeric()                              # convert from named num to numeric

> IDs
[1] 1 3

Answer 2 (score: 0)

Here is an alternative solution that saves the matrix m as a data frame and runs the process for each ID:

# example data
m<-matrix(c(1,1,1,1,2,2,34,45,4,4,4,4,4,5,6,3,3,3,3,21,22,3425,345,65,22,42,65,86,456,454,5678,5,234,22,65,21,22,786),nrow=19)
colnames(m)<-c("ID","LO")
NoN<-c(21,22)

library(dplyr)

data.frame(m) %>%                            # save m as dataframe
  group_by(ID) %>%                           # for each ID
  summarise(sum_flag = sum(LO %in% NoN)) %>% # count number of LO elements in NoN
  filter(sum_flag == length(NoN)) %>%        # keep rows where this number matches the length of NoN
  pull(ID)                                   # get the corresponding IDs

# [1] 1 3

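Note that this process assumes (based on your example) that the elements of NoN and the rows of m are unique.

Answer 3 (score: 0)

I got this answer from @docendodiscimus, and I find it efficient and concise.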
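If the uniqueness assumption cannot be guaranteed, one option is to test membership per ID instead of counting matches. The following is a sketch under that assumption, not the answerer's code; `m_dup`, `count_based` and `membership_based` are hypothetical names, and the data is made up so that ID 1 has a duplicated row but no LO equal to 22.

```r
library(dplyr)

# Hypothetical data: ID 1 repeats LO = 21 and lacks 22; ID 3 has both
m_dup <- data.frame(ID = c(1, 1, 3, 3), LO = c(21, 21, 21, 22))
NoN <- c(21, 22)

# Counting matches, as in the pipeline above: the duplicate inflates the count
count_based <- m_dup %>%
  group_by(ID) %>%
  summarise(sum_flag = sum(LO %in% NoN)) %>%
  filter(sum_flag == length(NoN)) %>%
  pull(ID)

# Membership test: keep only IDs whose LO values include every element of NoN
membership_based <- m_dup %>%
  group_by(ID) %>%
  filter(all(NoN %in% LO)) %>%
  distinct(ID) %>%
  pull(ID)
```

On this data `count_based` includes ID 1 as well as ID 3, while `membership_based` keeps only ID 3.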

df <- as.data.frame(m)
unique(df[as.logical(ave(df$LO, df$ID, FUN = function(x) all(NoN %in% x))),"ID"])