How to apply a function to data frames to get descriptive statistics

Time: 2019-01-26 12:30:11

Tags: r statistics

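I have a list of two data frames. In each, one variable (Year) should be treated as a factor and the other (value) is numeric, and I want descriptive statistics for the numeric one. Here is a sample of my list: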

> D1
   Year    value
1  1386 7.544808
2  1387 7.552638
3  1387 7.572596
4  1387 7.790549
5  1388 7.607089
6  1388 7.635559
7  1389 7.469881
8  1389 7.622461
9  1389 7.622461
10 1390 7.596479
11 1390 7.645063
12 1391 7.654853
13 1391 7.605891
14 1392 7.612247
15 1381 7.747241
16 1383 7.808759
17 1383 7.834336
18 1384 7.482341
19 1384 7.433035

> D2
   Year    value
1  1386 7.544808
2  1387 7.552638
3  1387 7.572596
4  1387 7.790549
5  1388 7.607089
6  1388 7.635559
7  1389 7.469881
8  1389 7.622461
9  1389 7.622461
10 1390 7.596479
11 1390 7.645063
12 1391 7.654853
13 1391 7.605891
14 1392 7.612247
15 1381 7.747241
16 1383 7.808759
17 1383 7.834336
18 1384 7.482341
19 1384 7.433035

My_list <- list(Label1 = D1, Label2 = D2)

Now I want to apply the functions below to this list to produce descriptive statistics of value for each Year category.

# Mean with a 95% confidence interval for a numeric vector
MeanFunc<-function(x) round(mean(x,na.rm = TRUE),digits=6 )
# Despite its name, SEFunc returns the half-width of the 95% t-interval: t * sd / sqrt(n)
SEFunc<-function(x) round(qt(0.975,df=sum(!is.na(x))-1)*sd(x,na.rm = TRUE)/sqrt(sum(!is.na(x)) ),digits=5 )
SDFunc<-function(x) round(sd(x,na.rm = TRUE),digits=5 )
LeftFunc<-function(x)  round(mean(x,na.rm = TRUE)-SEFunc(x),digits=5) 
RightFunc<-function(x) round(mean(x,na.rm = TRUE)+SEFunc(x),digits=5)  
MaxFunc<-function(x) round(max(x,na.rm = TRUE) ,digits=5)  
MinFunc<-function(x) round(min(x,na.rm = TRUE) ,digits=5) 

multi.fun <- function(x) {
  c(Mean = MeanFunc(x), SE = SEFunc(x), SD = SDFunc(x), Left=LeftFunc(x),Right=RightFunc(x),Max=MaxFunc(x),Min=MinFunc(x))
} 

How can I generate output shaped like the following list?

$Label1
          Mean      SE      SD    Left   Right     Max     Min
value 7.407750 0.02683 0.35525 7.38092 7.43458 8.54102 5.90301
1381  0.203978 0.09325 1.23486 0.11073 0.29723 8.08833 0.00000
1382  0.078627 0.05813 0.76970 0.02050 0.13676 7.99239 0.00000
1383  0.635951 0.16005 2.11930 0.47590 0.79600 8.54102 0.00000
1384  0.422948 0.13113 1.73636 0.29182 0.55408 8.20205 0.00000
1385  0.267271 0.10543 1.39602 0.16184 0.37270 8.30430 0.00000
1386  0.354070 0.12012 1.59055 0.23395 0.47419 7.85514 0.00000
1387  1.279604 0.21165 2.80268 1.06795 1.49125 8.23982 0.00000
$Label2
          Mean      SE      SD    Left   Right     Max     Min
value 7.407750 0.02683 0.35525 7.38092 7.43458 8.54102 5.90301
1381  0.203978 0.09325 1.23486 0.11073 0.29723 8.08833 0.00000
1382  0.078627 0.05813 0.76970 0.02050 0.13676 7.99239 0.00000
1383  0.635951 0.16005 2.11930 0.47590 0.79600 8.54102 0.00000
1384  0.422948 0.13113 1.73636 0.29182 0.55408 8.20205 0.00000
1385  0.267271 0.10543 1.39602 0.16184 0.37270 8.30430 0.00000
1386  0.354070 0.12012 1.59055 0.23395 0.47419 7.85514 0.00000
1387  1.279604 0.21165 2.80268 1.06795 1.49125 8.23982 0.00000

Thank you very much.

1 answer:

Answer 0 (score: 0)

Try this solution:

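library(tidyverse)  # dplyr (group_by, summarise, mutate_at) and purrr (map)
library(plotrix)    # std.error()

# Note: std.error() is the plain standard error, sd / sqrt(n); the question's
# SEFunc additionally multiplies by qt(0.975, n - 1).
My_list %>%
  map(
    # For each data frame in the list, compute one summary row per Year
    ~ group_by(.x, Year) %>%
      summarise(
        Mean = mean(value, na.rm = TRUE) %>% round(6),
        SE = std.error(value, na.rm = TRUE),
        SD = sd(value, na.rm = TRUE),
        Left = mean(value, na.rm = TRUE) - std.error(value, na.rm = TRUE),
        Right = mean(value, na.rm = TRUE) + std.error(value, na.rm = TRUE),
        Max = max(value, na.rm = TRUE),
        Min = min(value, na.rm = TRUE)
      ) %>%
      # Round columns 3 to 8 (SE through Min) to 5 decimal places
      mutate_at(3:8, ~ round(.x, 5))
  )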
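
If you would rather keep the question's own statistics (where SE is the half-width of the 95% confidence interval) and get one matrix per list element with an overall "value" row on top, a base R sketch along these lines might work. The helper name describe_by_year is made up for illustration, the helper functions are a compact restatement of the ones in the question, and the list names Label1 and Label2 are assumed.

```r
# Rebuild the sample data from the question (D1 and D2 are identical there).
D1 <- data.frame(
  Year  = c(1386, 1387, 1387, 1387, 1388, 1388, 1389, 1389, 1389, 1390,
            1390, 1391, 1391, 1392, 1381, 1383, 1383, 1384, 1384),
  value = c(7.544808, 7.552638, 7.572596, 7.790549, 7.607089, 7.635559,
            7.469881, 7.622461, 7.622461, 7.596479, 7.645063, 7.654853,
            7.605891, 7.612247, 7.747241, 7.808759, 7.834336, 7.482341,
            7.433035)
)
D2 <- D1
My_list <- list(Label1 = D1, Label2 = D2)  # names assumed

# Compact restatement of the question's helpers.
# SEFunc is the half-width of the 95% t-interval, as in the question.
SEFunc <- function(x) {
  n <- sum(!is.na(x))
  round(qt(0.975, df = n - 1) * sd(x, na.rm = TRUE) / sqrt(n), digits = 5)
}
multi.fun <- function(x) {
  m <- mean(x, na.rm = TRUE)
  c(Mean  = round(m, 6),
    SE    = SEFunc(x),
    SD    = round(sd(x, na.rm = TRUE), 5),
    Left  = round(m - SEFunc(x), 5),
    Right = round(m + SEFunc(x), 5),
    Max   = round(max(x, na.rm = TRUE), 5),
    Min   = round(min(x, na.rm = TRUE), 5))
}

# Hypothetical helper: one row for all values, then one row per Year.
describe_by_year <- function(df) {
  # split() gives a named list of value vectors, one per Year
  by_year <- do.call(rbind, lapply(split(df$value, df$Year), multi.fun))
  rbind(value = multi.fun(df$value), by_year)
}

# Apply to every data frame in the list; the result keeps the list names.
result <- lapply(My_list, describe_by_year)
result$Label1
```

Years with a single observation (1381, 1386 and 1392 in the sample) have no defined standard deviation, so SE, SD, Left and Right come out missing for those rows, and qt() may emit a warning for zero degrees of freedom.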