R plotly: width of jittered boxplots changes with NAs

Date: 2018-10-26 12:07:23

Tags: r boxplot na r-plotly jitter

I am using the following function to plot a grouped boxplot with jitter:

plot_boxplot <- function(dat) {
  # take one row from each joined_group so that every group can be plotted
  allx <- dat %>% 
    mutate(y = median(y, na.rm = TRUE)) %>%
    group_by(joined_group) %>% 
    sample_n(1) %>% 
    ungroup()

  p <- dat %>%
    plotly::plot_ly() %>%
    # plotting all the groups 1:20
    plotly::add_trace(data = allx, 
                      x = ~as.numeric(joined_group),
                      y = ~y,
                      type = "box",
                      hoverinfo = "none",
                      boxpoints = FALSE,
                      color = NULL,
                      opacity = 0,
                      showlegend = FALSE) %>% 
    # plotting the boxes
    plotly::add_trace(data = dat, 
                      x = ~as.numeric(joined_group),
                      y = ~y,
                      color = ~group1,
                      type = "box",
                      hoverinfo = "none",
                      boxpoints = FALSE,
                      showlegend = FALSE) %>% 
    # adding ticktext
    layout(xaxis = list(tickvals = 1:20,
                        ticktext = rep(levels(dat$group1), each = 4)))

  p <- p %>%
    # adding jittering
    add_markers(data = dat,
                x = ~jitter(as.numeric(joined_group), amount = 0.2),
                y = ~y,
                color = ~group1,
                showlegend = FALSE)
  p

}

The problem is that when some levels have NA as the y variable, the width of the jittered boxes changes. Here is an example:

library(plotly)
library(dplyr)
set.seed(123)
dat <- data.frame(group1 = factor(sample(letters[1:5], 100, replace = TRUE)),
                  group2 = factor(sample(LETTERS[21:24], 100, replace = TRUE)),
                  y = runif(100)) %>% 
  dplyr::mutate(joined_group = factor(
    paste0(group1, "-", group2)
  ))

# do the plot with all the levels
p1 <- plot_boxplot(dat)

# now set every y value in group1 level "e" to NA
dat$y[dat$group1 == "e"] <- NA

# create the plot with missing data
p2 <- plot_boxplot(dat)

# creating the subplot to see that the width has changed:
subplot(p1, p2, nrows = 2)

The problem is that the boxes have different widths in the two plots.

I have noticed that the boxes are the same size when there is no jitter, so I know the jitter is "messing with the width", but I don't know how to fix it.

Does anyone know how to make the widths exactly the same in both jittered plots?

1 Answer:

Answer 0: (score: 2)

I see two separate shifts between the plots:

  1. one due to the jitter
  2. one due to the NAs

The first can be fixed by declaring a new jitter function with a fixed seed:

fixed_jitter <- function (x, factor = 1, amount = NULL) {
  set.seed(42)
  jitter(x, factor, amount)
}

and using it in place of jitter in the add_markers call.
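As a quick check of why the fixed seed helps, the sketch below compares plain jitter with fixed_jitter on a small toy vector. The vector x is made up for illustration; the point is only that reseeding before each call yields the same offsets whenever the input has the same length, which is the case for the two plots in the question (both use 100 rows).

```r
# Same definition as in the answer: reseed before every call
fixed_jitter <- function(x, factor = 1, amount = NULL) {
  set.seed(42)
  jitter(x, factor, amount)
}

# Hypothetical toy positions standing in for as.numeric(joined_group)
x <- rep(1:3, each = 4)

# Plain jitter() draws fresh random offsets on every call,
# so two plots built from it will not line up point for point
a <- jitter(x, amount = 0.2)
b <- jitter(x, amount = 0.2)

# fixed_jitter() restarts the random stream each time,
# so equal-length inputs receive identical offsets
fa <- fixed_jitter(x, amount = 0.2)
fb <- fixed_jitter(x, amount = 0.2)
identical(fa, fb)  # TRUE
```

Note that calling set.seed inside the function also resets the global random number state, which may matter if other code in the same session relies on random draws.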

The second can be fixed by assigning -1 in place of NA and setting

yaxis = list(range = c(0, ~max(1.1 * y)))

as the second argument to layout.
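A minimal sketch of that second step, on made-up toy data rather than the question's dat, might look like the following. It assumes all real y values are non-negative, so the -1 placeholders fall below the visible axis range. As a variation on the answer, the upper axis limit is computed as a plain number (y_max, a name introduced here) instead of with a formula inside layout.

```r
library(plotly)

# Hypothetical toy data: four groups, the last one entirely NA
set.seed(123)
toy <- data.frame(x = rep(1:4, each = 5), y = runif(20))
toy$y[toy$x == 4] <- NA

# Upper axis limit taken from the observed values only
y_max <- 1.1 * max(toy$y, na.rm = TRUE)

# Replace NA with -1 so every group still contributes a box;
# the placeholder boxes sit below the visible range set in layout()
toy_filled <- toy
toy_filled$y[is.na(toy_filled$y)] <- -1

p <- plot_ly(toy_filled, x = ~x, y = ~y, type = "box", boxpoints = FALSE) %>%
  layout(yaxis = list(range = c(0, y_max)))
```

In plot_boxplot from the question, the same replacement would be applied to dat before plotting, and the yaxis entry would sit next to the existing xaxis entry in the layout call.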