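I am trying to use ggsurvplot_facet() to plot survival curves for several variables, faceted by sex. When I apply the code to a single fitted model, it works fine. However, when I use the same code inside a function or a for loop, it fails to draw all the survival curves it should and returns an error. If ggsurvplot_facet() accepted a list of survfit objects as input, the way ggsurvplot() does, I would do the plotting within ggsurvplot_facet() itself, but ggsurvplot_facet() only accepts a single survfit object at a time.
I am running the code in RStudio on a 2018 MacBook Pro with macOS High Sierra.
Consider the following dataset: http://s000.tinyupload.com/index.php?file_id=01704535336107726906
It contains observations from multiple visits for 100 subjects and 4 different variables. Two of the variables (variable1 and variable2) can take two different values (0 or 1), and the other two (variable3 and variable4) can take three different values (0, 1 or 2).
I started with the variables that can take two different values, and wrote the following code: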
# Load libraries
require(mgcv)
require(msm)
library(dplyr)
library(grDevices)
library(survival)
library(survminer)
# Set working directory
dirname<-dirname(rstudioapi::getSourceEditorContext()$path)
setwd(dirname)
load("ggsurvplot_facet_error.rda")
fit_test <- survfit(
  Surv(follow_up, as.numeric(status)) ~ (sex + variable1), data = data)
plot_test <- ggsurvplot_facet(fit_test,
  data = data,
  pval = TRUE,
  conf.int = TRUE,
  surv.median.line = "hv", # Specify median survival
  break.time.by = 1,
  facet.by = "sex",
  ggtheme = theme_bw(), # Change ggplot2 theme
  palette = "aaas",
  legend = "bottom",
  xlab = "Time (years)",
  ylab = "Death probability",
  panel.labs = list(sex_recoded=c("Male", "Female")),
  legend.labs = c("A", "B")
)
plot_test
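This code works well and produces the following plot:
However, when I try to turn this code into a function or a for loop, so that the same code can be applied to variable1 and variable2, I always get an error in the colour/palette part of the plotting step.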
# Variables_with_2_categories: variable1 and variable2
two <- c("variable1", "variable2")
## TEST #1: USING A FUNCTION
fit_plot_function <- function(x) {
  # FIT part of the function
  two.i <- two[i]
  fit_temp <- survfit(Surv(as.numeric(follow_up), as.numeric(status)) ~
    sex + eval(as.name(paste0(two.i))), data = data)
  # PLOT part of the function
  plot_temp <- ggsurvplot_facet(fit_temp,
    data = data,
    pval = TRUE,
    conf.int = TRUE,
    surv.median.line = "hv", # Specify median survival
    break.time.by = 1,
    facet.by = "sex",
    ggtheme = theme_bw(), # Change ggplot2 theme
    palette = "aaas",
    legend = "bottom",
    xlab = "Time (years)",
    ylab = "Death probability",
    panel.labs = list(sex_recoded=c("Male", "Female")),
    legend.labs = rep(c("A", "B"),2)
  )
}
fit_plot_function(two)
# Warning message:
# Now, to change color palette, use the argument palette=
# 'eval(as.name(paste0(two.i)))' instead of color = 'eval(as.name(paste0(two.i)))'
print(plot_temp)
# Error in grDevices::col2rgb(colour, TRUE) :
# invalid color name 'eval(as.name(paste0(two.i)))'
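It seems that the variable name is not recognised when it is evaluated from a name taken out of a vector. With a for loop, exactly the same thing happens: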
## TEST #2: USING A FOR LOOP
n.two <- length(two)
for(i in 1:n.two) {
  two.i <- two[i]
  fit_temp <- survfit(Surv(as.numeric(follow_up), as.numeric(status)) ~
    (sex + eval(as.name(paste0(two.i)))), data = data)
  plot_temp <- ggsurvplot_facet(fit_temp,
    data = data,
    pval = TRUE,
    conf.int = TRUE,
    surv.median.line = "hv", # Specify median survival
    break.time.by = 1,
    facet.by = "sex",
    ggtheme = theme_bw(), # Change ggplot2 theme
    palette = "aaas",
    legend = "bottom",
    xlab = "Time (years)",
    ylab = "Death probability",
    panel.labs = list(sex_recoded=c("Male", "Female")),
    legend.labs = rep(c("A", "B"),2)
  )
}
print(plot_temp)
# ERROR: Now, to change color palette, use the argument palette= 'eval(as.name(paste0(two.i)))'
# instead of color = 'eval(as.name(paste0(two.i)))
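As a side note, it would be nice if I could apply the same code to variables with either two or three different values, without having to write a separate function for each kind of variable.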
Thank you very much for your help,
Best regards,
Casein
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] survminer_0.4.3.999 ggpubr_0.2 magrittr_1.5 ggplot2_3.1.1 survival_2.44-1.1
[6] dplyr_0.8.0.1 msm_1.6.7 mgcv_1.8-27 nlme_3.1-137
loaded via a namespace (and not attached):
[1] Rcpp_1.0.1 pillar_1.3.1 compiler_3.5.1 plyr_1.8.4 tools_3.5.1 digest_0.6.18
[7] tibble_2.1.1 gtable_0.3.0 lattice_0.20-38 pkgconfig_2.0.2 rlang_0.3.4 Matrix_1.2-17
[13] ggsci_2.9 rstudioapi_0.10 cmprsk_2.2-7 yaml_2.2.0 mvtnorm_1.0-10 expm_0.999-4
[19] xfun_0.6 gridExtra_2.3 knitr_1.22 withr_2.1.2 survMisc_0.5.5 generics_0.0.2
[25] grid_3.5.1 tidyselect_0.2.5 data.table_1.12.2 glue_1.3.1 KMsurv_0.1-5 R6_2.4.0
[31] km.ci_0.5-2 purrr_0.3.2 tidyr_0.8.3 scales_1.0.0 backports_1.1.4 splines_3.5.1
[37] assertthat_0.2.1 xtable_1.8-3 colorspace_1.4-1 labeling_0.3 lazyeval_0.2.2 munsell_0.5.0
[43] broom_0.5.2 crayon_1.3.4 zoo_1.8-5
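Answer 0 (score: 0)
It's time to go tidy. You can do all of this with purrr, and there are good write-ups on combining ggplot2 with purrr, along with further examples.
First, we need to convert your data to long format using tidyr::gather. We keep everything in the data frame as it is, except variable1 to variable4, which get melted into a single column.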
library(tidyr)
library(dplyr)
library(purrr)
data %>%
  gather(num, variable, -sample_id, -sex,
         -visit_number, -age_at_enrollment,
         -follow_up, -status) %>%
  mutate(num2 = num) %>% # We'll need this column later for the titles
  as_tibble() -> long_data
# A tibble: 2,028 x 8
sample_id sex visit_number age_at_enrollment follow_up status num variable
<fct> <fct> <fct> <dbl> <dbl> <fct> <chr> <int>
1 sample_0001 Female 1 56.7 0 1 variable1 0
2 sample_0001 Female 2 57.7 0.920 1 variable1 0
3 sample_0001 Female 3 58.6 1.90 1 variable1 0
4 sample_0001 Female 4 59.7 2.97 2 variable1 0
5 sample_0001 Female 5 60.7 4.01 1 variable1 0
6 sample_0001 Female 6 61.7 4.99 1 variable1 0
7 sample_0002 Female 1 55.9 0 1 variable1 1
8 sample_0002 Female 2 56.9 1.04 1 variable1 1
9 sample_0002 Female 3 58.0 2.15 1 variable1 1
10 sample_0002 Female 4 59.0 3.08 1 variable1 1
# ... with 2,018 more rows
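As an aside, gather() still works but has been superseded by pivot_longer() since tidyr 1.0.0. Here is a minimal sketch of the same reshaping, assuming tidyr 1.0.0 or later is installed. The two-row `toy` data frame and its values are made up for illustration; only the column names mirror the dataset above.

```r
library(tidyr)
library(dplyr)

# Made-up miniature of the dataset: two subjects, four variables
toy <- data.frame(
  sample_id = c("sample_0001", "sample_0002"),
  sex       = c("Female", "Male"),
  follow_up = c(0.92, 1.04),
  status    = c(1, 2),
  variable1 = c(0L, 1L),
  variable2 = c(1L, 0L),
  variable3 = c(2L, 0L),
  variable4 = c(1L, 2L)
)

# Melt variable1..variable4 into a name column (num) and a value column (variable)
long_toy <- toy %>%
  pivot_longer(cols = starts_with("variable"),
               names_to = "num", values_to = "variable") %>%
  mutate(num2 = num) # copy of the name, kept for plot titles

long_toy
```

Note that pivot_longer() orders the rows by subject rather than by variable, which makes no difference once the data are grouped by num.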
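Now we need to convert the long data frame into a nested data frame, and then map! Be careful when using ggsurvplot: the function does not support the tibbles created by nest(), so each one has to be converted to a plain data.frame first.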
long_data %>%
  group_by(num) %>%
  nest() %>%
  mutate(
    # Run survfit() for every variable
    fit_f = map(data, ~survfit(Surv(follow_up, as.numeric(status)) ~ (sex + variable), data = .)),
    # Create survplot for every variable and survfit
    plots = map2(fit_f, data, ~ggsurvplot(.x,
        as.data.frame(.y), # Important! convert from tibble to data.frame
        pval = TRUE,
        conf.int = TRUE,
        facet.by = "sex",
        surv.median.line = "hv",
        break.time.by = 1,
        ggtheme = theme_bw(),
        palette = "aaas",
        xlab = "Time (years)",
        ylab = "Death probability") +
      ggtitle(paste0("This is plot of ", unique(.y$num2))) + # Add a title (num2 is constant within each group)
      theme(legend.position = "bottom"))) -> plots
Now you can display the plots by typing:
plots$plots[[1]]
plots$plots[[2]]
plots$plots[[3]]
plots$plots[[4]] # plotted below
And save them to files with map2():
map2(paste0(unique(long_data$num), ".pdf"), plots$plots, ggsave)
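The nest-and-map step can also be tried without downloading the dataset. Here is a minimal sketch on the lung data that ships with the survival package. The two binary covariates `older` and `impaired` are made up for illustration and have nothing to do with the variables in the question. Only the fitting step is shown, so survminer is not needed to run it.

```r
library(survival) # provides the lung example data
library(dplyr)
library(tidyr)
library(purrr)

# Two made-up binary covariates derived from lung (illustrative only)
long_lung <- lung %>%
  transmute(time, status, sex,
            older    = as.integer(age > 60),
            impaired = as.integer(ph.ecog > 0)) %>%
  gather(num, variable, older, impaired)

# One row per covariate; the list-column `data` holds that covariate's rows
nested_fits <- long_lung %>%
  group_by(num) %>%
  nest() %>%
  mutate(fit_f = map(data, ~ survfit(Surv(time, status) ~ sex + variable, data = .x)))

nested_fits$num               # names of the covariates that were fitted
class(nested_fits$fit_f[[1]]) # each element is an ordinary survfit object
```

Each element of fit_f can then be passed to ggsurvplot() together with as.data.frame() of the matching element of data, exactly as in the pipeline above.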
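Update
Unfortunately, I could not figure out how to change the legend labels inside the pipeline. The only solution I can suggest is the following. Remember that plots$plots[[…]] are ggplot objects, so you can change anything about them afterwards. For example, to change the legend labels I only need to add a discrete fill scale and a discrete colour scale (scale_fill_discrete and scale_color_discrete, or their ggsci equivalents as in the code that follows). The same can be done for titles, labs, themes and so on.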
library(ggsci) # to add aaas color palette
plots$plots[[3]] +
  labs(title = "Variable 3",
       subtitle = "You just have to be the best") +
  ggsci::scale_color_aaas(guide = "none") +     # hide the colour legend
  ggsci::scale_fill_aaas(labels = LETTERS[1:3]) # relabel the fill legend
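Separately from the purrr approach, the error in the question seems to come from the eval(as.name(...)) call ending up verbatim in the strata names of the fit, where the plotting code then takes it for a variable name. One possible way around this is to build the formula from the variable name first, so that the fit only ever sees ordinary column names. The sketch below does this on survival::lung; the covariates `older` and `impaired` are hypothetical stand-ins for variable1 and variable2.

```r
library(survival)

# Made-up binary covariates standing in for variable1 and variable2
d <- transform(lung,
               older    = as.integer(age > 60),
               impaired = as.integer(ph.ecog > 0))
vars <- c("older", "impaired")

# Build the formula as text, then fit; no eval(as.name(...)) inside the formula
fits <- lapply(vars, function(v) {
  f <- as.formula(paste("Surv(time, status) ~ sex +", v))
  survfit(f, data = d)
})
names(fits) <- vars

# The strata are now labelled with the real column names
names(fits$older$strata)
```

survminer also ships surv_fit(), a wrapper around survfit() that is meant for fits created inside functions and loops. I have not checked it against the dataset in the question, so treat it as something to try rather than a confirmed fix.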