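I am trying to get the mean, sum, and count for all columns, grouped by a class variable, but for the count with n() (the third statement) I get this error:
Error: This function should not be called directly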
Class <- c("A","A","A","A","B","B","B","C","C","C","C","C","C")
A<-c(23,33,NA,56,22,34,34,45,65,5,57,75,57)
D<-c(2,133,5,60,23,312,341,25,75,NA,3,9,21)
M<-c(34,35,67,325,46,56,547,47,67,67,68,3,12)
df <- data.frame(Class,A,D,M)
library(dplyr)
system.time(df_sum <- df %>% group_by(Class) %>% summarise_if(is.numeric, sum , na.rm=T))
system.time(df_mean <- df %>% group_by(Class) %>% summarise_if(is.numeric, mean , na.rm=T))
system.time(df_count <- df %>% group_by(Class) %>% summarise_if(is.numeric, n() , na.rm=T))
Please suggest any modifications needed to the statements above.
Answer 0 (score: 3)
To get the number of non-NA values in each numeric column, you can use:
library(dplyr)
df %>%
group_by(Class) %>%
summarise_if(is.numeric,
function(x) sum(!is.na(x)))
#output
# A tibble: 3 x 4
  Class     A     D     M
  <fct> <int> <int> <int>
1 A         3     4     4
2 B         3     3     3
3 C         6     5     6
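The n() function offers very little flexibility: it takes no arguments, so it has no na.rm parameter, and it simply returns the number of rows in the current group. Writing n() inside summarise_if() calls it immediately, outside of a summary context, which is why the error above is raised.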
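If both kinds of count are wanted, one possible approach is sketched below. It assumes dplyr 1.0.0 or later (for across() and where()), and the column name n_rows is only an illustrative choice, not something from the question. The across() call comes first so that the newly created n_rows column is not itself picked up by where(is.numeric).

```r
library(dplyr)

# Same data as in the question
Class <- c("A","A","A","A","B","B","B","C","C","C","C","C","C")
A <- c(23,33,NA,56,22,34,34,45,65,5,57,75,57)
D <- c(2,133,5,60,23,312,341,25,75,NA,3,9,21)
M <- c(34,35,67,325,46,56,547,47,67,67,68,3,12)
df <- data.frame(Class, A, D, M)

df_count <- df %>%
  group_by(Class) %>%
  summarise(
    # non-NA count for every numeric column
    across(where(is.numeric), ~ sum(!is.na(.x))),
    # group size, NA rows included (n_rows is a hypothetical name)
    n_rows = n()
  )

print(df_count)
```

Here n() is called with no arguments directly inside summarise(), which is the context it is designed for, while the NA handling is left to sum(!is.na(.x)).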