Removing categories with small samples from geom_violin inside a function

Date: 2017-03-10 20:02:18

Tags: r dataframe ggplot2

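I am looking for a way to drop the categories for which the violin plot draws nothing (those with fewer than 3 observations?), such as BB and CCC in the plot, while still keeping all of the data for the "Whole sample" violin. Is there a simpler way than filtering the data frame and appending a copy of the original (for the whole-sample violin)?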

# example df
library(ggplot2)
b<-abs(round(rnorm(8, sd=30)))
y<-runif(5)
pr<-y/sum(y)
names<-unlist(lapply(mapply(rep, LETTERS[1:5], 1:5), function (x) paste0(x, collapse = "") ) )
x <- sample(names, 8, replace=TRUE, prob=pr)
x
df<-data.frame(name=x,numbers=b)

violinplot_fun <- function(dataset, var, groupcol, adjust1, maxx) {
  ggplot(dataset)+
    geom_violin(aes_string(y = var, x = groupcol), scale = "width", 
                alpha = 0.4, adjust = adjust1) + 
    geom_violin(aes_(y = as.name(var), x = "Whole sample"), scale = "width", 
                alpha = .4, adjust = adjust1) +
    scale_y_continuous(limits = c(0,ceiling(maxx)) , breaks = scales::pretty_breaks(15) ) + 
    coord_flip()
} 

violinplot_fun(df,"numbers", "name",0.5,100)

[Figure: violin plot produced by the code above]

1 answer:

Answer 0 (score: 1)

You can do this by editing the data frame with the data.table package before passing it to the function:

library(data.table)
dt <- as.data.table(df)
# add a column n holding the number of rows in each name group
dt1 <- dt[, n := .N, by = name]
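To check which groups the threshold would drop, the per-group counts can be inspected before plotting. This is a minimal sketch on a small made-up table (`dt_demo` and its values are illustrative, not the asker's random data):

```r
library(data.table)

# Illustrative data: A has 3 rows, BB has 1, CCC has 2
dt_demo <- data.table(
  name    = c("A", "A", "A", "BB", "CCC", "CCC"),
  numbers = c(5, 12, 30, 7, 18, 22)
)

# Same counting step as above: n is the number of rows in each name group
dt_demo[, n := .N, by = name]

# Groups that fall below a threshold of 3 observations
small_groups <- unique(dt_demo[n < 3, .(name, n)])
print(small_groups)
```

Because `:=` adds the column by reference, `dt_demo` itself now carries `n`, so `dt_demo[n >= 3]` can be passed straight to the plotting function.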

Edit: I changed your function slightly:

violinplot_fun <- function(dataset, dataset_orig, var, groupcol, adjust1, maxx) {
  ggplot(dataset)+
    geom_violin(aes_string(y = var, x = groupcol), scale = "width", 
                alpha = 0.4, adjust = adjust1) + 
    geom_violin(data = dataset_orig, aes_(y = as.name(var), x = "Whole sample"), 
                scale = "width", alpha = .4, adjust = adjust1) +
    scale_y_continuous(limits = c(0,ceiling(maxx)) , breaks = scales::pretty_breaks(15) ) + 
    coord_flip()
} 

violinplot_fun(dt1[n >= 3,], dataset_orig = dt1, "numbers", "name",0.5,100)

This gives you:

[Figure: violin plot showing only the groups with at least 3 observations, plus the whole sample]

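Also, if you know you will never change the minimum number of individuals you accept (i.e. 3), you can write the function like this, so that you only need to pass a single dataset argument: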

# dataset must be a data.table that already has the per-group count column n
violinplot_fun <- function(dataset, var, groupcol, adjust1, maxx) {
  ggplot(dataset[n >= 3]) +
    geom_violin(aes_string(y = var, x = groupcol), scale = "width", 
            alpha = 0.4, adjust = adjust1) + 
    geom_violin(data = dataset, aes_(y = as.name(var), x = "Whole sample"), scale = "width", 
            alpha = .4, adjust = adjust1) +
    scale_y_continuous(limits = c(0,ceiling(maxx)) , breaks = scales::pretty_breaks(15) ) + 
    coord_flip()
} 

Alternatively, you can make the threshold a parameter.
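A minimal sketch of that last variant follows. It makes several assumptions that are not in the answer above: the argument `min_n` and the helper `keep_large_groups` are hypothetical names, the counting is done with base R's `table()` rather than data.table so that a plain data frame can be passed in, and `aes()` with the `.data` pronoun (ggplot2 3.0.0 or later) stands in for `aes_string()`.

```r
library(ggplot2)

# Hypothetical helper (not from the original answer): keep only the rows whose
# group has at least min_n observations, using base R instead of data.table.
keep_large_groups <- function(dataset, groupcol, min_n = 3) {
  dataset <- as.data.frame(dataset)
  counts <- table(dataset[[groupcol]])
  large <- names(counts)[counts >= min_n]
  dataset[dataset[[groupcol]] %in% large, , drop = FALSE]
}

# Variant of violinplot_fun with the threshold exposed as min_n (an added argument)
violinplot_min_n <- function(dataset, var, groupcol, adjust1, maxx, min_n = 3) {
  ggplot(keep_large_groups(dataset, groupcol, min_n)) +
    # per-group violins, drawn from the filtered rows only
    geom_violin(aes(y = .data[[var]], x = .data[[groupcol]]),
                scale = "width", alpha = 0.4, adjust = adjust1) +
    # whole-sample violin, drawn from every row
    geom_violin(data = as.data.frame(dataset),
                aes(y = .data[[var]], x = "Whole sample"),
                scale = "width", alpha = 0.4, adjust = adjust1) +
    scale_y_continuous(limits = c(0, ceiling(maxx)),
                       breaks = scales::pretty_breaks(15)) +
    coord_flip()
}

# Illustrative data: BB (1 row) and CCC (2 rows) fall below min_n = 3
df_demo <- data.frame(
  name    = c("A", "A", "A", "BB", "CCC", "CCC", "DDDD", "DDDD", "DDDD", "DDDD"),
  numbers = c(5, 12, 30, 7, 18, 22, 3, 41, 56, 64)
)
kept <- keep_large_groups(df_demo, "name", min_n = 3)
p <- violinplot_min_n(df_demo, "numbers", "name", 0.5, 100, min_n = 3)
```

Printing `p` draws the plot. Since the filtering happens inside the function, the caller passes the unfiltered data once and does not need to precompute an `n` column.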