Question

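When I try to use the add_p() function to get p-values for the differences between a by variable (with 10 levels) and categorical variables with two levels (yes/no), I get the errors shown below. I am not sure how to provide a reproducible example. Using the trial data, I guess my by variable would be the "T Stage" variable with 10 levels, and the categorical variables would be (1) "Chemotherapy Treatment" with 2 levels and (2) "Chemotherapy Treatment 2" with 4 levels. In any case, this is the code I ran.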

library(gtsummary)
library(tidyverse)
library(kableExtra)  # provides kable_styling(), used at the end of the pipeline
miro_def %>% 
  select(mheim, age_dx, time_t1d_yrs, gender, collard, fhist_pandz) %>% 
  tbl_summary(by = mheim, missing = "no",
              type = list(c(gender, collard, fhist_pandz, mheim) ~ "categorical"),
              label = list(gender ~ "Gender", 
                           fhist_pandz ~ "Family history of PD", 
                           age_dx ~ "Age at diagnosis", 
                           time_t1d_yrs ~ "Follow-up(years)")) %>% 
  add_p() %>% 
  # style the output with custom header 
  #modify_header(stat_by = "{level}") %>% 
  # convert to kableExtra 
  as_kable_extra(booktabs = TRUE) %>% 
  # reduce font size to make table fit. 
  # you may also use the `latex_options = "scale_down"` argument here. 
  kable_styling(font_size = 7, latex_options = "scale_down")

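However, I do get p-values for the by variable (10 levels) against the other variables (continuous/numeric).

How can I fix this error?
If I have a multi-level by variable and multi-level (> 2 levels) categorical variables as described, do I need to do something special to get p-values?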

There was an error in 'add_p()' for variable 'gender' and test 'fisher.test', p-value omitted: Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18540 is too small for this problem, (pastp=51.2364, ipn_0:=ipoin[itp=150]=215, stp[ipn_0]=40.6787). Increase workspace or consider using 'simulate.p.value=TRUE'

There was an error in 'add_p()' for variable 'collard' and test 'fisher.test', p-value omitted: Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18570 is too small for this problem, (pastp=37.0199, ipn_0:=ipoin[itp=211]=823, stp[ipn_0]=23.0304). Increase workspace or consider using 'simulate.p.value=TRUE'

There was an error in 'add_p()' for variable 'fhist_pandz' and test 'fisher.test', p-value omitted: Error in stats::fisher.test(data[[variable]], as.factor(data[[by]])): FEXACT error 7(location). LDSTP=18570 is too small for this problem, (pastp=36.4614, ipn_0:=ipoin[itp=58]=1, stp[ipn_0]=31.8106). Increase workspace or consider using 'simulate.p.value=TRUE'

Answer 1

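Since it solved the problem for me, I want to point out that as of gtsummary version 1.3.6, add_p() has an argument that lets you pass arguments on to the test function (i.e. test.args). Thanks to the developers for this!

From the NEWS:
Each add_p() method now has the test.args = argument. Use this argument to pass additional arguments to the statistical method, e.g.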

add_p(test = c(age, marker) ~ "t.test",
      test.args = c(age, marker) ~ list(var.equal = TRUE))

It is also documented in the add_p() help file (i.e. ?add_p).
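Applied to the Fisher error in the question, the same argument could in principle forward simulate.p.value = TRUE to fisher.test(). The following is only a sketch under assumptions: it uses the trial data shipped with gtsummary (where stage has 4 levels, not 10), the native |> pipe (R >= 4.1), and an arbitrary B = 10000; none of these come from the question's data.

```r
library(gtsummary)

# Simulated p-values are random, so fix the seed for reproducibility
set.seed(2021)

# Assumption: `trial` stands in for the asker's data; `stage` plays the by variable
tbl <- trial[c("stage", "trt", "grade")] |>
  tbl_summary(by = stage, missing = "no") |>
  add_p(
    # request Fisher's test for every categorical variable
    test = all_categorical() ~ "fisher.test",
    # forward extra arguments to stats::fisher.test() for those variables
    test.args = all_tests("fisher.test") ~ list(simulate.p.value = TRUE, B = 10000)
  )

tbl
```

The all_tests() selector restricts the extra arguments to variables tested with fisher.test, so continuous variables in the same table would keep their default tests and arguments.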

Answer 2

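Since nobody has posted an answer, here is what I used when I ran into this problem. Following the example given in the help file ?gtsummary::add_p.tbl_summary, I wrote a custom function that runs fisher.test with the simulate.p.value = TRUE option: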

## define custom test
fisher.test.simulate.p.values <- function(data, variable, by, ...) {
  result <- list()
  test_results <- stats::fisher.test(data[[variable]], data[[by]], simulate.p.value = TRUE)
  result$p <- test_results$p.value
  result$test <- test_results$method
  result
}

## add p-values to your gtsummary table, using custom test defined above
summary_table %>%
add_p(
  test = list(all_categorical() ~ "fisher.test.simulate.p.values")  # this applies the custom test to all categorical variables
)

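You can also change the number of iterations used to compute the simulated p-value by adding the B argument (default B = 2000) to the fisher.test() call above.

Of course, all of this assumes that using Fisher's test is appropriate in the first place.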
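To illustrate the effect of B outside of gtsummary, here is a minimal base R sketch. The data are randomly generated and purely hypothetical (a 10-level grouping factor and a yes/no outcome), and the seeds are arbitrary. The simulated p-value is computed from B random tables, so it cannot fall below 1 / (B + 1), and it changes from run to run unless a seed is set.

```r
# Hypothetical data: 10-level grouping factor and a two-level outcome
set.seed(42)
group   <- factor(sample(paste0("G", 1:10), 300, replace = TRUE))
outcome <- factor(sample(c("no", "yes"), 300, replace = TRUE))

# Default number of simulated tables (B = 2000)
set.seed(1)
p_2000 <- stats::fisher.test(outcome, group, simulate.p.value = TRUE)$p.value

# More simulated tables give a finer-grained p-value at higher cost
set.seed(1)
p_1e5 <- stats::fisher.test(outcome, group, simulate.p.value = TRUE, B = 1e5)$p.value

c(B_2000 = p_2000, B_1e5 = p_1e5)
```

A larger B mainly matters when the p-value is small, because the resolution of the simulated p-value is limited by the number of simulated tables.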

Answer 3

I had a similar problem. You have to increase the workspace by using test.args inside add_p().

miro_def %>% 
  select(mheim, age_dx, time_t1d_yrs, gender, collard, fhist_pandz) %>% 
  tbl_summary(by = mheim, missing = "no",
              type = list(c(gender, collard, fhist_pandz, mheim) ~ "categorical"),
              label = list(gender ~ "Gender", 
                           fhist_pandz ~ "Family history of PD", 
                           age_dx ~ "Age at diagnosis", 
                           time_t1d_yrs ~ "Follow-up(years)")) %>% 
  add_p(test.args = variable_with_no_pval ~ list(workspace=2e9))

or

add_p(test.args = all_tests("fisher.test") ~ list(workspace=2e9))
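According to the fisher.test() help page, workspace is given in units of 4 bytes, so workspace = 2e9 asks for roughly 8 GB; a smaller value may already be enough. The error message itself names two remedies, and they can be combined. Below is a base R sketch, not part of gtsummary: the helper name fisher_with_fallback, the workspace of 2e7 (about 80 MB) and B = 10000 are assumptions chosen for illustration, and the data are randomly generated.

```r
# Hypothetical helper: try the exact test with a larger workspace first,
# and fall back to a simulated p-value if the exact computation errors out
fisher_with_fallback <- function(x, y, workspace = 2e7, B = 10000) {
  tryCatch(
    stats::fisher.test(x, y, workspace = workspace),
    error = function(e) {
      stats::fisher.test(x, y, simulate.p.value = TRUE, B = B)
    }
  )
}

# Small made-up example: two-level outcome against a four-level grouping
set.seed(1)
x <- factor(sample(c("no", "yes"), 60, replace = TRUE))
y <- factor(sample(letters[1:4], 60, replace = TRUE))

res <- fisher_with_fallback(x, y)
res$p.value
res$method  # reports whether the p-value was exact or simulated
```

Checking res$method afterwards makes it clear which branch produced the p-value, which is worth reporting alongside the result.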
