Suppose I have a dataset like this:
id <- c(1,1,1,2,2,3,3,4,4)
visit <- c("A", "B", "C", "A", "B", "A", "C", "A", "B")
test1 <- c(12,16, NA, 11, 15,NA, 0,12, 5)
test2 <- c(1,NA, 2, 2, 2,2, NA,NA, NA)
df <- data.frame(id,visit,test1,test2)
I want to know the number of non-missing data points per test at each visit, so that the final output looks like this:
visit test1 test2
A 3 3
B 3 1
C 1 1
I know I can use the aggregate function for a single variable (test1 here), as mentioned in an older post:
aggregate(x = df$id[!is.na(df$test1)], by = list(df$visit[!is.na(df$test1)]), FUN = length)
But how do I do this for multiple tests at once?
Answer 0 (score: 2)
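In base R, using table and rowSums (table drops NA values by default, so each row sum is the count of non-missing values for that visit):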
cols <- 3:4
sapply(cols, function(i) rowSums(table(df$visit, df[,i]), na.rm = TRUE))
# [,1] [,2]
#A 3 3
#B 3 1
#C 1 1
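The result above has no column names because cols holds column positions. A minimal sketch of a variant that keeps the test names, assuming the columns are selected by name and using tapply in place of table (a swapped-in technique, not the answer's original code):

```r
# Rebuild the example data so the snippet runs on its own
df <- data.frame(
  id    = c(1, 1, 1, 2, 2, 3, 3, 4, 4),
  visit = c("A", "B", "C", "A", "B", "A", "C", "A", "B"),
  test1 = c(12, 16, NA, 11, 15, NA, 0, 12, 5),
  test2 = c(1, NA, 2, 2, 2, 2, NA, NA, NA)
)

cols <- c("test1", "test2")  # select by name instead of position

# For each test column, sum the "is not NA" indicator within each visit
counts <- sapply(df[cols], function(x) tapply(!is.na(x), df$visit, sum))
counts
```

Here counts is a matrix with the visits as row names and the test names as column names, so individual cells can be looked up with, for example, counts["B", "test2"].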
Answer 1 (score: 2)
You can also use data.table, which is handy when the number of test columns varies:
library(data.table)
cols <- names(df)[grepl("test", names(df))]
# Count the non-NA values in each test column, grouped by visit
setDT(df)[, lapply(.SD, function(x) sum(!is.na(x))), by = visit, .SDcols = cols]
# visit test1 test2
#1: A 3 3
#2: B 3 1
#3: C 1 1
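If an extra package is not wanted, the aggregate call from the question can probably be extended to several columns as well. The following is only a sketch under that assumption: it turns the test columns into a logical "is not NA" matrix first and then sums per visit, which is a variation on the question's code, not something either answer proposed.

```r
# Rebuild the example data so the snippet runs on its own
df <- data.frame(
  id    = c(1, 1, 1, 2, 2, 3, 3, 4, 4),
  visit = c("A", "B", "C", "A", "B", "A", "C", "A", "B"),
  test1 = c(12, 16, NA, 11, 15, NA, 0, 12, 5),
  test2 = c(1, NA, 2, 2, 2, 2, NA, NA, NA)
)

cols <- grep("test", names(df), value = TRUE)  # "test1" "test2"

# !is.na(df[cols]) is a logical matrix; summing TRUE values per visit
# gives the number of non-missing data points for each test
res <- aggregate(!is.na(df[cols]), by = list(visit = df$visit), FUN = sum)
res
```

The result is an ordinary data frame with the columns visit, test1 and test2, which matches the layout asked for in the question.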