I want to generate both means and frequencies for a subset of my categorical variables.
library(gtsummary)
library(dplyr)
mtcars2 <- mtcars %>% mutate(across(matches('cyl|gear|carb'), as.factor))
I know I can use this to get the categorical and continuous output separately.
mtcars_out <- tbl_summary(mtcars2,
                          statistic = list(all_numeric() ~ "{mean} ({sd})",
                                           all_categorical() ~ "{n} / {N} ({p}%)")) %>% as_tibble()
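Since mtcars$cyl already has levels attached, I would like to use mtcars2 as is and also produce the mean of that variable. Something like the following... but tbl_summary does not like it, because cyl is a categorical variable.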
mtcars_out <- tbl_summary(mtcars2,
                          statistic = list(all_numeric() ~ "{mean} ({sd})",
                                           "cyl" ~ "{mean} ({sd})")) %>% as_tibble()
Error: Problem with `mutate()` input `tbl_stats`.
x There was an error assembling the summary statistics for 'cyl'
with summary type 'categorical'.
There are 2 common sources for this error.
1. You have requested summary statistics meant for continuous
variables for a variable being as summarized as categorical.
To change the summary type to continuous, add the argument
`type = list(cyl ~ 'continuous')`
2. One of the functions or statistics from the `statistic=` argument is not valid.
i Input `tbl_stats` is `pmap(...)`.
I tried specifying the type in the call, but that does not work either.
mtcars_out <- tbl_summary(mtcars2,
                          type = list("cyl" ~ "continuous"),
                          statistic = list(all_numeric() ~ "{mean} ({sd})",
                                           all_categorical() ~ "{n} / {N} ({p}%)")) %>% as_tibble()
Error: Problem with `mutate()` input `summary_type`.
x Column 'cyl' is class "factor" and cannot be summarized as a continuous variable.
i Input `summary_type` is `assign_summary_type(...)`.
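My actual dataset has 500 variables, and a class has already been assigned to each of them, so I do not want to change the class types in the original dataset. I would like to specify this inside the tbl_summary call.
Any help would be much appreciated!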
Answer 0 (score: 0)
You have made cyl a factor, and R does not let you take the mean of a factor variable.
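To make that concrete, here is a small base R sketch (the vector cyl_fct is made up for illustration and is not part of the question's data). It shows that mean() on a factor only returns NA with a warning, and that a numeric version has to be built from the level labels rather than from the internal codes.

```r
# A factor stores integer codes plus level labels ("4", "6", "8").
cyl_fct <- factor(c(4, 6, 8, 8))

# mean() on a factor warns ("argument is not numeric or logical") and returns NA.
m_fct <- suppressWarnings(mean(cyl_fct))

# as.numeric() alone returns the internal codes (1, 2, 3, 3), not the values.
codes <- as.numeric(cyl_fct)

# Going through the labels recovers the original numbers (4, 6, 8, 8).
values <- as.numeric(as.character(cyl_fct))
m_num <- mean(values)  # 6.5
```

This is also why the error in the question says a factor column cannot be summarized as continuous: the mean has to be computed on a numeric column.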
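I think the simplest approach for you is to keep both a numeric version and a factor version of the variable. You can then summarize both variables, and afterwards remove the extra header row (the one belonging to the factor version of the variable).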
library(gtsummary)
library(tidyverse)
tbl <-
  mtcars %>%
  select(cyl) %>%
  mutate(fct_cyl = factor(cyl)) %>%
  tbl_summary(
    type = where(is.numeric) ~ "continuous",
    statistic = where(is.numeric) ~ "{mean} ({sd})",
    label = cyl ~ "No. Cylinders"
  )
# remove extra header row for factor variables
tbl$table_body <-
  tbl$table_body %>%
  filter(!(startsWith(variable, "fct_") & row_type == "label"))
# print table
tbl
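If only the mean is needed for cyl and the stored dataset must keep its factor classes, one possible variation is to convert the column inside the pipe only, so that mtcars2 itself is never modified. The following is a rough sketch under that assumption, not a tested recipe for the 500-variable dataset; fct_to_num is a hypothetical helper name introduced here, and all_continuous() is used in place of the question's all_numeric().

```r
library(gtsummary)
library(dplyr)

# Same setup as in the question: cyl, gear and carb stored as factors.
mtcars2 <- mtcars %>%
  mutate(across(matches("cyl|gear|carb"), as.factor))

# Hypothetical helper: factor -> numeric via the level labels, not the codes.
fct_to_num <- function(x) as.numeric(as.character(x))

tbl_num <- mtcars2 %>%
  # Convert cyl only in the copy flowing through the pipe; mtcars2 is untouched.
  mutate(across(c(cyl), fct_to_num)) %>%
  tbl_summary(
    # cyl has only three distinct values, so force the continuous summary type.
    type = list(cyl ~ "continuous"),
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{n} / {N} ({p}%)"
    )
  )

mtcars_out <- as_tibble(tbl_num)
```

With many variables, the c(cyl) selector could be swapped for a vector of the column names that should be treated this way. Note that this gives the mean of cyl instead of its frequencies; to show both, the numeric and factor copies still need to sit side by side as in the code above.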