I want to build a new table that holds per-group summary statistics computed from an existing table. This is what I have:
df <- data.frame(
  number = c(3, 4, 5, 6, 7, 3, 5, 6, 7, 6),
  group = c("red", "yellow", "green", "green", "yellow", "yellow", "red", "red", "red", "green")
)
Please see the image for the desired data frame. Thank you!
Answer 0 (score: 0)
If you create a table holding the threshold for each group and join it onto your data, you can get the values easily with dplyr.
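library(dplyr)
gorder <- c("red", "yellow", "green")
df %>%
  # attach each group's threshold to every row of that group
  left_join(data.frame(group = gorder, thresh = c(5, 6, 2)), by = "group") %>%
  group_by(group) %>%
  # per group: total, count of values above the threshold, and the threshold itself
  summarize(numbers = sum(number), above_thresh = sum(number > thresh), thresh = first(thresh))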
# # A tibble: 3 x 4
# group numbers above_thresh thresh
# <fct> <dbl> <int> <dbl>
# 1 green 17 3 2
# 2 red 21 2 5
# 3 yellow 14 1 6
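To see why the join makes the row-wise comparison possible, here is a minimal sketch of the intermediate step on its own. It assumes the `df` from the question; the names `thresholds` and `joined` are introduced here purely for illustration, and `stringsAsFactors = FALSE` is set explicitly so the sketch behaves the same on older and newer R versions.

```r
library(dplyr)

df <- data.frame(
  number = c(3, 4, 5, 6, 7, 3, 5, 6, 7, 6),
  group = c("red", "yellow", "green", "green", "yellow",
            "yellow", "red", "red", "red", "green"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: one row per group with its threshold
thresholds <- data.frame(
  group = c("red", "yellow", "green"),
  thresh = c(5, 6, 2),
  stringsAsFactors = FALSE
)

# Every row of df keeps its values and gains the thresh of its group,
# so number > thresh can then be evaluated row by row
joined <- left_join(df, thresholds, by = "group")
print(joined)
```

After this step each row carries its own threshold, which is what lets `sum(number > thresh)` inside `summarize()` count the values above the threshold within each group.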
If you really need the output in that particular "shape", you can bring in tidyr to help reshape it.
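library(dplyr)
library(tidyr)
gorder <- c("red", "yellow", "green")                 # desired column order
norder <- c("numbers", "thresh", "above_thresh")      # desired row order
df %>%
  left_join(data.frame(group = gorder, thresh = c(5, 6, 2)), by = "group") %>%
  group_by(group) %>%
  summarize(numbers = sum(number), above_thresh = sum(number > thresh), thresh = first(thresh)) %>%
  # stack the three statistics into name/value pairs
  pivot_longer(-group) %>%
  # factors fix the ordering of groups (columns) and statistics (rows)
  mutate(group = factor(group, levels = gorder), name = factor(name, levels = norder)) %>%
  arrange(group, name) %>%
  # spread the groups back out into one column each
  pivot_wider(names_from = group, values_from = value)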
# # A tibble: 3 x 4
# name red yellow green
# <fct> <dbl> <dbl> <dbl>
# 1 numbers 21 14 17
# 2 thresh 5 6 2
# 3 above_thresh 2 1 3