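I know the answer is probably simple, but so far I have not been able to work it out. I also looked for an answer in similar questions, without success. In short, I need to return the IDs of the matrix m that contain all the elements of the vector NoN. In the example I prepared below, I need to return IDs 1 and 3.
Example: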
m<-matrix(c(1,1,1,1,2,2,34,45,4,4,4,4,4,5,6,3,3,3,3,21,22,3425,345,65,22,42,65,86,456,454,5678,5,234,22,65,21,22,786),nrow=19)
colnames(m)<-c("ID","LO")
NoN<-c(21,22)
My attempts so far:
1: m[all(m[,2] %in% NoN),1]
2: m[match(NoN, m[,2]),1]
3: subset(m, m[,2] %in% NoN)
4: m[which(m[,2] %in% NoN),1]
Thanks!
Answer 0 (score: 3)
Here is a function using base R:
FOO <- function(m, NoN){
  # split the matrix by the ID column; each piece is rebuilt as a two-column matrix
  m2 <- lapply(split(m, m[, 1]), function(x) matrix(x, ncol = 2))
  # for each ID, look up every element of NoN in its LO column (position, or NA if absent)
  matchresult <- do.call(cbind, lapply(m2, function(x) match(NoN, x[, 2])))
  # return the column names (= IDs) of the columns that contain no NA
  as.numeric(colnames(matchresult)[colSums(is.na(matchresult)) == 0])
}
Result of the function call:
> FOO(m, NoN)
[1] 1 3
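This is untested beyond your example, but it should handle a NoN of arbitrary length as well as repeated combinations of ID and LO.
Edit: a more concise and efficient variant, provided by @docendodiscimus: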
FOO <- function(m, NoN){
  df <- as.data.frame(m)
  unique(df[as.logical(ave(df$LO, df$ID, FUN = function(x) all(NoN %in% x))), "ID"])
}
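The ave() variant is compact enough that the intermediate step is easy to miss. The sketch below unpacks it on the question's data; the names flag and ids are only illustrative. It assumes, as in the example, that LO is numeric, so the per-group TRUE/FALSE is stored as 1/0.

```r
# Example data from the question
m <- matrix(c(1,1,1,1,2,2,34,45,4,4,4,4,4,5,6,3,3,3,3,
              21,22,3425,345,65,22,42,65,86,456,454,5678,5,234,22,65,21,22,786),
            nrow = 19)
colnames(m) <- c("ID", "LO")
NoN <- c(21, 22)
df <- as.data.frame(m)

# ave() evaluates FUN once per ID group and writes the single result back
# to every row of that group (coerced to 1/0 because df$LO is numeric).
flag <- ave(df$LO, df$ID, FUN = function(x) all(NoN %in% x))

# Inspect the per-row flag next to the data
print(cbind(df, flag))

# Keep the IDs of flagged rows, once each
ids <- unique(df$ID[as.logical(flag)])
print(ids)
```

Because the test is all(NoN %in% x), repeated LO values within an ID do not affect the result.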
Answer 1 (score: 2)
m<-matrix(c(1,1,1,1,2,2,34,45,4,4,4,4,4,5,6,3,3,3,3,21,22,3425,345,65,22,42,65,86,456,454,5678,5,234,22,65,21,22,786),nrow=19)
colnames(m)<-c("ID","LO")
NoN<-c(21,22)
IDs <- m[m[, 2] %in% NoN, 1]
IDs <- table(IDs)
IDs <- names(IDs)[IDs >= length(NoN)]
> IDs
[1] "1" "3"
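Be aware, however, that this does not account for duplicated values: if ID 1 had two LO entries equal to 21 but none equal to 22, it would still be returned. The dplyr version below counts distinct values per ID instead: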
library(dplyr)
m <- data.frame(m)
IDs <- m %>%
  slice(which(LO %in% NoN)) %>%            # get all rows which contain values from NoN
  group_by(ID) %>%                         # group by ID
  summarise(uniques = n_distinct(LO)) %>%  # count unique values per ID
  filter(uniques == length(NoN)) %>%       # number of unique values has to be the same as the number of values in NoN
  select(ID) %>%                           # select ID columns
  unlist() %>%                             # unlist it
  as.numeric()                             # convert from named num to numeric
> IDs
[1] 1 3
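To make the caveat about duplicated values concrete, here is a small base R sketch on made-up data (m_dup is hypothetical and not part of the question). It shows the table() count keeping an ID it should not, and one possible fix, which is to drop duplicated (ID, LO) rows before counting.

```r
# Hypothetical data: ID 1 has LO = 21 twice and never 22; ID 3 has both values.
m_dup <- matrix(c(1, 1, 1, 3, 3,
                  21, 21, 99, 21, 22),
                ncol = 2, dimnames = list(NULL, c("ID", "LO")))
NoN <- c(21, 22)

# Plain counting: ID 1 reaches a count of 2 through the repeated 21.
hits <- table(m_dup[m_dup[, "LO"] %in% NoN, "ID"])
naive <- names(hits)[hits >= length(NoN)]

# Removing duplicated rows first turns each count into a count of distinct values.
m_uniq <- unique(m_dup)
hits_uniq <- table(m_uniq[m_uniq[, "LO"] %in% NoN, "ID"])
deduped <- names(hits_uniq)[hits_uniq >= length(unique(NoN))]

print(naive)    # includes "1", which lacks 22
print(deduped)
```

Like the original, this returns the IDs as character strings; wrap the result in as.numeric() if numbers are needed.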
Answer 2 (score: 0)
Here is an alternative solution that converts the matrix m to a data frame and processes each ID in turn:
# example data
m<-matrix(c(1,1,1,1,2,2,34,45,4,4,4,4,4,5,6,3,3,3,3,21,22,3425,345,65,22,42,65,86,456,454,5678,5,234,22,65,21,22,786),nrow=19)
colnames(m)<-c("ID","LO")
NoN<-c(21,22)
library(dplyr)
data.frame(m) %>%                             # save m as dataframe
  group_by(ID) %>%                            # for each ID
  summarise(sum_flag = sum(LO %in% NoN)) %>%  # count number of LO elements in NoN
  filter(sum_flag == length(NoN)) %>%         # keep rows where this number matches the length of NoN
  pull(ID)                                    # get the corresponding IDs
# [1] 1 3
Note that this approach assumes (based on your example) that the elements of NoN are unique and that the rows of m are unique.
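If that uniqueness assumption might not hold, one option is to test membership directly instead of comparing counts. The following base R sketch uses made-up data (m_dup is hypothetical) and tapply() in place of the dplyr pipeline; it is meant as an illustration of the difference, not as a drop-in replacement.

```r
# Hypothetical data with repeated rows:
# ID 1 has 21 and 22; ID 2 has 22 twice; ID 3 has 21 twice plus 22.
m_dup <- matrix(c(1, 1, 2, 2, 3, 3, 3,
                  21, 22, 22, 22, 21, 21, 22),
                ncol = 2, dimnames = list(NULL, c("ID", "LO")))
NoN <- c(21, 22)

# Count-based rule (same logic as the pipeline): hits per ID must equal length(NoN).
n_hits <- tapply(m_dup[, "LO"] %in% NoN, m_dup[, "ID"], sum)
by_count <- as.numeric(names(n_hits)[n_hits == length(NoN)])

# Membership-based rule: every element of NoN must appear among the ID's LO values.
has_all <- tapply(m_dup[, "LO"], m_dup[, "ID"], function(x) all(NoN %in% x))
by_membership <- as.numeric(names(has_all)[has_all])

print(by_count)       # keeps ID 2 and drops ID 3
print(by_membership)  # keeps exactly the IDs that contain both 21 and 22
```

With unique rows and a unique NoN, as in the question's example, both rules give the same answer.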
Answer 3 (score: 0)
I got this answer from @docendo discimus, and I find it effective and concise.
df <- as.data.frame(m)
unique(df[as.logical(ave(df$LO, df$ID, FUN = function(x) all(NoN %in% x))), "ID"])