How do I extract descriptive statistics from an entire table that contains factors?

Time: 2019-02-21 11:11:45

Tags: r

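Is there a way in R to extract several descriptive statistics (for example the mean, median, and confidence intervals) into a separate data frame?

Here is the code I used to generate the data frame: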

health <- data.frame(ID=c(1,2,3,4,5,6,7,8,9,10), Stroke = factor(c(0,0,1,0,0,1,0,0,0,1)), 
                     Diab = factor(c(0,0,0,0,0,1,0,0,0,1)), MI = factor(c(0,0,0,0,0,1,0,0,0,1)),
                     Age = factor(c(65,66,78,55,67,66,79,54,65,78)), 
                     Sex = factor(c("M","M","F","M","M","M","F","M","F","F")))

This is what the data frame looks like:

   ID Stroke Age Sex MI_imp[, 1] diab_imp[, 1]
1   1      0  65   M           0             0
2   2      0  66   M           0             0
3   3      1  78   F           0             0
4   4      0  55   M           0             0
5   5      0  67   M           0             0
6   6      1  66   M           1             1
7   7      0  79   F           0             0
8   8      0  54   M           0             0
9   9      0  65   F           0             0
10 10      1  78   F           1             1

I tried running this code to extract the confidence intervals, and it returned an error:

sapply(health_imp[-1], quantile, probs=c(0.5, 0.05, 0.95), na.rm=TRUE)
Error in quantile.default(X[[i]], ...) : factors are not allowed

1 Answer:

Answer 0: (score: 0)

The summary() function may be what you are looking for:

health <- data.frame(ID = c(1,2,3,4,5,6,7,8,9,10),
                     Stroke = factor(c(0,0,1,0,0,1,0,0,0,1)),
                     Diab = factor(c(0,0,0,0,0,1,0,0,0,1)),
                     MI = factor(c(0,0,0,0,0,1,0,0,0,1)),
                     Age = factor(c(65,66,78,55,67,66,79,54,65,78)),
                     Sex = factor(c("M","M","F","M","M","M","F","M","F","F")))
summary(health)

To save the result to another data frame:

health_table <- summary(health)
health_df <- as.data.frame(health_table)
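
Note that summary() on a data frame returns a table object, so as.data.frame() gives a long layout (columns Var1, Var2, Freq) whose values are text strings, and for factor columns it reports level counts rather than means or quantiles. If you want numeric statistics in their own data frame, one possible approach is sketched below. It assumes Age was only stored as a factor by accident and is really numeric, and that ID is an identifier that should be skipped. The names num_stats and fac_counts are made up for this illustration. Also keep in mind that the 5% and 95% quantiles describe the spread of the data and are not a confidence interval; the t.test() interval here is a confidence interval for the mean and assumes roughly normal data.

```r
# Rebuild the example data from the question.
health <- data.frame(
  ID     = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  Stroke = factor(c(0, 0, 1, 0, 0, 1, 0, 0, 0, 1)),
  Diab   = factor(c(0, 0, 0, 0, 0, 1, 0, 0, 0, 1)),
  MI     = factor(c(0, 0, 0, 0, 0, 1, 0, 0, 0, 1)),
  Age    = factor(c(65, 66, 78, 55, 67, 66, 79, 54, 65, 78)),
  Sex    = factor(c("M", "M", "F", "M", "M", "M", "F", "M", "F", "F"))
)

# Assumption: Age is really numeric. Convert via as.character() so we get
# the original values back instead of the factor's internal level codes.
health$Age <- as.numeric(as.character(health$Age))

# Numeric columns to summarise (ID is dropped because it is only a key).
num_cols <- setdiff(names(health)[sapply(health, is.numeric)], "ID")

# Build one row of statistics per numeric variable.
num_stats <- do.call(rbind, lapply(num_cols, function(v) {
  x  <- health[[v]]
  q  <- quantile(x, probs = c(0.05, 0.5, 0.95), na.rm = TRUE)
  ci <- t.test(x)$conf.int  # 95% CI for the mean (normality assumed)
  data.frame(
    variable = v,
    mean     = mean(x, na.rm = TRUE),
    median   = q[["50%"]],
    q05      = q[["5%"]],
    q95      = q[["95%"]],
    ci_low   = ci[1],
    ci_high  = ci[2]
  )
}))

# Factor columns: counts per level, stacked in long format.
fac_cols <- names(health)[sapply(health, is.factor)]
fac_counts <- do.call(rbind, lapply(fac_cols, function(v) {
  tab <- table(health[[v]])
  data.frame(variable = v, level = names(tab), n = as.vector(tab))
}))

print(num_stats)
print(fac_counts)
```

Keeping the numeric summaries and the factor counts in two separate data frames avoids mixing numbers and labels in one column, which is what makes the original sapply(..., quantile) call fail with "factors are not allowed".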