Here is a sample dataset:
structure(list(LD_wday = c(6, 2, 6, 1, 4, 4, 7, 6, 1, 3, 1, 3,
6, 1, 6, 4, 7, 7, 6, 2, 7, 1, 5, 2, 2, 2, 3, 3, 5, 1, 2, 5, 1,
6, 3, 4, 3, 4, 1, 6, 3, 6, 2, 6, 5, 5, 4, 3, 5, 6), status = c("successful",
"failed", "live", "successful", "failed", "successful", "failed",
"successful", "successful", "successful", "live", "successful",
"successful", "failed", "failed", "successful", "failed", "live",
"successful", "successful", "failed", "live", "successful", "successful",
"failed", "successful", "successful", "successful", "failed",
"failed", "failed", "failed", "failed", "successful", "live",
"failed", "live", "successful", "successful", "successful", "successful",
"failed", "failed", "live", "successful", "failed", "successful",
"failed", "failed", "successful")), row.names = c(NA, -50L), class = c("tbl_df",
"tbl", "data.frame"))
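I have been using group_by and summarize, but I end up with output like the one below. How can I create a successful/failed ratio from the dataset provided?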
sample %>%
filter(status == "failed" | status == "successful") %>%
group_by(LD_wday, status) %>%
summarize(count = n())
OUTPUT:
# A tibble: 13 x 3
# Groups:   LD_wday [7]
   LD_wday status     count
     <dbl> <chr>      <int>
 1       1 failed         3
 2       1 successful     3
 3       2 failed         4
 4       2 successful     3
 5       3 failed         1
 6       3 successful     5
 7       4 failed         2
 8       4 successful     4
 9       5 failed         4
10       5 successful     2
11       6 failed         2
12       6 successful     7
13       7 failed         3
Any help would be greatly appreciated, and I apologize if the question is not expressed clearly.
Answer 0 (score: 3)
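If we want each status's share of the two, we can divide 'count' by the sum of 'count'. No second group_by is needed: summarize drops only the last grouping variable ('status'), so the result is still grouped by 'LD_wday' and the sum is taken within each weekday.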
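library(dplyr)
sample %>%
  # keep only the two outcomes of interest (drops "live")
  filter(status == "failed" | status == "successful") %>%
  group_by(LD_wday, status) %>%
  # summarize() peels off 'status'; the result stays grouped by LD_wday
  summarize(count = n()) %>%
  # share of each status within its weekday; stored in a new column
  # so that the 'status' labels are not overwritten
  mutate(prop = count / sum(count))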
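If what you are after is a single successful/failed ratio per weekday rather than each status's share, one possible approach is to collapse the counts to one row per weekday and divide. The sketch below is only an illustration: it rebuilds the counts from the OUTPUT table in the question, and the names counts, ratios, successful, failed and ratio are my own choices, not anything from the original code.

```r
library(dplyr)

# Counts copied from the OUTPUT table in the question.
# Weekday 7 has no "successful" row.
counts <- data.frame(
  LD_wday = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7),
  status  = c("failed", "successful", "failed", "successful",
              "failed", "successful", "failed", "successful",
              "failed", "successful", "failed", "successful",
              "failed"),
  count   = c(3, 3, 4, 3, 1, 5, 2, 4, 4, 2, 2, 7, 3)
)

ratios <- counts %>%
  group_by(LD_wday) %>%
  summarize(
    # sum() over an empty selection is 0, so a missing status counts as 0
    successful = sum(count[status == "successful"]),
    failed     = sum(count[status == "failed"]),
    # successes per failure on that weekday
    ratio      = successful / failed
  )

print(ratios)
```

With these counts, weekday 7 gets a ratio of 0 because it has no successes. A weekday with successes but no failures would come out as Inf, so that case may need separate handling depending on how the ratio is used.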