Pretty tables with cumulative counts/percentages and group totals using the R "tables" package

Time: 2017-10-26 22:44:14

Tags: r html-table tabular cumulative-sum

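I am trying to generate a formatted HTML table with columns for frequency, cumulative frequency, column percentage, and cumulative column percentage. The table should also break the data down by a grouping variable and include a group total.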
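To make the four requested columns concrete, here is a minimal base R sketch on a hypothetical toy vector (the data and the name `toy` are made up for illustration and are not part of the question's data). It only shows how each statistic is derived for a single group, not how to format the final table.

```r
# Hypothetical toy data: number of sessions attended by 8 people
sessions <- c(1, 1, 2, 2, 2, 3, 3, 3)

n     <- as.vector(table(sessions))      # frequency per level
cum_n <- cumsum(n)                       # cumulative frequency
p     <- round(n / sum(n) * 100, 1)      # column percentage
cum_p <- round(cum_n / sum(n) * 100, 1)  # cumulative column percentage

toy <- data.frame(sessions = sort(unique(sessions)), n, cum_n, p, cum_p)
print(toy)
```

The last row of `cum_n` equals the group size and the last row of `cum_p` is 100, which is a quick sanity check for any per-group version of the same calculation.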

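I can almost achieve this with a combination of dplyr and tidyr, but the output is a data frame that does not look very pretty. Is there an easier way using the tables::tabular command?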

# Sample data (the seed value is arbitrary; it only makes the random draw reproducible)
set.seed(42)
dat <- data.frame(
  id = 1:100, 
  group = factor(sample(c("A", "B"), 100, replace = TRUE)),
  sessions = factor(sample(1:10, 100, replace = TRUE))
)

# dplyr/tidyr solution
library(dplyr)
library(tidyr)
dat %>% 
  group_by(group, sessions) %>% 
  tally() %>% 
  spread(key = group, value = n) %>% 
  mutate(All = rowSums(.[-1])) %>% 
  gather(key = group, value = n, -sessions) %>% 
  group_by(group) %>% 
  mutate(
    cum_n = cumsum(n),
    p = round(n / sum(n)*100,1),
    cum_p = round(cum_n / sum(n)*100,1),
  ) %>% 
  data.frame() %>% 
  reshape(timevar = "group", idvar = "sessions", direction = "wide")

# As far as I get using tables::tabular
library(tables)
tabular(
  Factor(sessions, "Sessions") ~ 
    (Heading()*group + 1) * 
    (
      (n = 1) + 
        # (cum_n = ??) +
        Heading("%")*Percent(denom = "col")*Format(digits = 2) 
        # + Heading("cum_%")*??*Format(digits = 2)
      ),
  data = dat
)

1 Answer:

Answer 0: (score: 2)

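I suggest using knitr::kable and kableExtra, which are great packages for generating tables. You can also have them output to multiple formats, for example producing html and latex (for pdf) from the same code.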

library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

dat %>% 
  group_by(group, sessions) %>% 
  tally() %>% 
  spread(key = group, value = n) %>% 
  mutate(All = rowSums(.[-1])) %>% 
  gather(key = group, value = n, -sessions) %>% 
  group_by(group) %>% 
  mutate(
    cum_n = cumsum(n),
    p = round(n / sum(n)*100,1),
    cum_p = round(cum_n / sum(n)*100,1),
  ) %>% 
  data.frame() %>% 
  reshape(timevar = "group", idvar = "sessions", direction = "wide") %>%
  kable("html") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
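As a rough illustration of the multi-format point, the sketch below renders one small table as both HTML and LaTeX by changing only the `format` argument of `knitr::kable`. The `toy` data frame is a hypothetical stand-in for the reshaped summary above, and the sketch assumes knitr is installed.

```r
library(knitr)

# Hypothetical toy summary table (not the question's data)
toy <- data.frame(sessions = 1:3, n = c(2, 3, 3), cum_n = c(2, 5, 8))

html_tbl  <- kable(toy, format = "html")   # HTML <table> markup
latex_tbl <- kable(toy, format = "latex")  # LaTeX tabular markup, e.g. for PDF output

# kable() returns a character vector, so the markup can be printed directly
cat(html_tbl, sep = "\n")
cat(latex_tbl, sep = "\n")
```

Note that `bootstrap_options` in `kable_styling` only affect HTML output, so styling for a LaTeX/PDF target would need to be set separately.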
