Construct a new data frame and get summary statistics in RStudio

Time: 2019-10-09 14:14:50

Tags: r dataframe

I want to build a new table that holds per-group summary statistics computed from an existing table. This is what I have:

df <- data.frame(
  number =  c(3,4,5,6,7,3,5,6,7,6),
  group= c("red", "yellow", "green", "green", "yellow", "yellow", "red", "red", "red", "green")
)

Please see the image for the desired data frame. Thank you!


1 answer:

Answer 0: (score: 0)

If you create a table holding the threshold for each group and join it onto your data, you can easily get the values with dplyr.

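library(dplyr)
# Desired group order; the thresholds below are listed in the same order
gorder <- c("red","yellow","green")
df %>% 
  # attach each row's threshold; naming the key avoids the "Joining, by" message
  left_join(data.frame(group=gorder, thresh=c(5, 6, 2)), by="group") %>% 
  group_by(group) %>% 
  # per group: total, count of values strictly above the threshold, the threshold itself
  summarize(numbers=sum(number), above_thresh=sum(number>thresh), thresh=first(thresh)) 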
# # A tibble: 3 x 4
#   group  numbers above_thresh thresh
#   <fct>    <dbl>        <int>  <dbl>
# 1 green       17            3      2
# 2 red         21            2      5
# 3 yellow      14            1      6
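
For comparison, the same join-then-summarise idea can be sketched in base R without dplyr. This is only an illustrative alternative, not part of the original answer: the names `thresholds`, `merged` and `result` are made up here, and `stringsAsFactors = FALSE` is set explicitly so the sketch behaves the same across R versions.

```r
# Same data as in the question.
df <- data.frame(
  number = c(3, 4, 5, 6, 7, 3, 5, 6, 7, 6),
  group  = c("red", "yellow", "green", "green", "yellow",
             "yellow", "red", "red", "red", "green"),
  stringsAsFactors = FALSE
)

# Threshold lookup table (hypothetical name), values taken from the answer.
thresholds <- data.frame(
  group  = c("red", "yellow", "green"),
  thresh = c(5, 6, 2),
  stringsAsFactors = FALSE
)

# Attach each row's threshold by matching on the shared "group" column.
merged <- merge(df, thresholds, by = "group")

# Per-group total, and per-group count of values strictly above the threshold.
numbers      <- tapply(merged$number, merged$group, sum)
above_thresh <- tapply(merged$number > merged$thresh, merged$group, sum)

# Assemble the summary; groups come out in alphabetical order.
result <- data.frame(
  group        = names(numbers),
  numbers      = as.vector(numbers),
  above_thresh = as.vector(above_thresh),
  thresh       = thresholds$thresh[match(names(numbers), thresholds$group)],
  stringsAsFactors = FALSE
)
print(result)
```

The comparison `number > thresh` is strict, so a value equal to its group's threshold is not counted.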

If you really need that "shape" of output, you can bring in tidyr to help reshape the result:

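library(dplyr)
library(tidyr)
# Desired column order (groups) and row order (statistics) of the final table
gorder <- c("red","yellow","green")
norder <- c("numbers", "thresh", "above_thresh")
df %>% 
  left_join(data.frame(group=gorder, thresh=c(5, 6, 2)), by="group") %>% 
  group_by(group) %>% 
  summarize(numbers=sum(number), above_thresh=sum(number>thresh), thresh=first(thresh)) %>% 
  # long format: one row per (group, statistic) pair
  pivot_longer(-group) %>% 
  # factors with explicit levels make arrange() sort in the desired order
  mutate(group=factor(group, levels=gorder), name=factor(name, levels=norder)) %>% 
  arrange(group, name) %>% 
  # wide format: one column per group, one row per statistic
  pivot_wider(names_from=group, values_from=value)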
# # A tibble: 3 x 4
#   name           red yellow green
#   <fct>        <dbl>  <dbl> <dbl>
# 1 numbers         21     14    17
# 2 thresh           5      6     2
# 3 above_thresh     2      1     3
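
Because every statistic here is numeric, the reshaping step amounts to reordering and transposing the summary table. The sketch below shows that in base R, as an illustration only: `summary_tbl` is a hypothetical name holding the per-group values printed in the first output above, and the result is a plain matrix rather than a tibble.

```r
# Per-group summary, copied from the first output in the answer.
summary_tbl <- data.frame(
  group        = c("green", "red", "yellow"),
  numbers      = c(17, 21, 14),
  above_thresh = c(3, 2, 1),
  thresh       = c(2, 5, 6),
  stringsAsFactors = FALSE
)

gorder <- c("red", "yellow", "green")              # desired column order
norder <- c("numbers", "thresh", "above_thresh")   # desired row order

# Put the groups in the desired order and keep only the statistic columns.
ordered <- summary_tbl[match(gorder, summary_tbl$group), norder]

# Transpose so that statistics become rows and groups become columns.
wide <- t(as.matrix(ordered))
colnames(wide) <- gorder
print(wide)
```

A transpose like this only works cleanly when all the columns share one type; the `pivot_longer()` / `pivot_wider()` route in the answer keeps the result as a tibble with a `name` column.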