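I just found out how to create a split violin plot from here, and then how to extend it to more than two groups from here. But I don't understand how to transfer this to my own data.
Right now I'm stuck creating my own split violin plot.
My goal is this; each group should represent one cancer entity (left side: reference incidence, right side: cohort incidence).
Following the first link (see above), I used this code to create the function geom_split_violin: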
library(ggplot2)

GeomSplitViolin <- ggproto("GeomSplitViolin", GeomViolin,
  draw_group = function(self, data, ..., draw_quantiles = NULL) {
    # Left and right outline of the full violin for this group
    data <- transform(data, xminv = x - violinwidth * (x - xmin), xmaxv = x + violinwidth * (xmax - x))
    grp <- data[1, "group"]
    # Odd-numbered groups keep the left half, even-numbered groups the right half
    newdata <- plyr::arrange(transform(data, x = if (grp %% 2 == 1) xminv else xmaxv), if (grp %% 2 == 1) y else -y)
    # Close the polygon along the central axis
    newdata <- rbind(newdata[1, ], newdata, newdata[nrow(newdata), ], newdata[1, ])
    newdata[c(1, nrow(newdata) - 1, nrow(newdata)), "x"] <- round(newdata[1, "x"])
    if (length(draw_quantiles) > 0 & !scales::zero_range(range(data$y))) {
      stopifnot(all(draw_quantiles >= 0), all(draw_quantiles <= 1))
      # create_quantile_segment_frame() is an internal ggplot2 helper, hence the `:::`
      quantiles <- ggplot2:::create_quantile_segment_frame(data, draw_quantiles)
      aesthetics <- data[rep(1, nrow(quantiles)), setdiff(names(data), c("x", "y")), drop = FALSE]
      aesthetics$alpha <- rep(1, nrow(quantiles))
      both <- cbind(quantiles, aesthetics)
      quantile_grob <- GeomPath$draw_panel(both, ...)
      ggplot2:::ggname("geom_split_violin", grid::grobTree(GeomPolygon$draw_panel(newdata, ...), quantile_grob))
    } else {
      ggplot2:::ggname("geom_split_violin", GeomPolygon$draw_panel(newdata, ...))
    }
  })

geom_split_violin <- function(mapping = NULL, data = NULL, stat = "ydensity", position = "identity", ..., draw_quantiles = NULL, trim = TRUE, scale = "area", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE) {
  layer(data = data, mapping = mapping, stat = stat, geom = GeomSplitViolin, position = position, show.legend = show.legend, inherit.aes = inherit.aes, params = list(trim = trim, scale = scale, draw_quantiles = draw_quantiles, na.rm = na.rm, ...))
}
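I don't fully understand the code above, and my guess is that it creates a split violin plot with only two groups.
I hope my request is not too general and that you can tell me how to get to my plot.
All I have so far is: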
my_data <- read.csv("Test2.csv", sep = ";")
y <- data_vp$Age.groups
x <- cbind(data_vp[,3:8])
m <- cbind(data_vp[,3:5], data_vp[,6:8])
ggplot(my_data,
aes(x = x,
y = y,
fill = m)) +
geom_split_violin()
After getting your answer, @missuse, I tried again with this code:
raw_df <- read.csv("Test2.csv", sep = ";")
View(raw_df)
inc_all <- raw_df[,2:7]
inc_oM <- raw_df[,2:4]
inc_mM <- raw_df[,5:7]
dff <- data.frame(y = raw_df$Age.groups,
groups = as.factor(inc_all),
split = as.factor(inc_oM + inc_mM))
ggplot(dff,
aes(x = groups,
y = y,
fill = split)) +
geom_split_violin()
It doesn't work :(
Answer 0 (score: 0)
Here is an example of how to use geom_split_violin with any number of groups:
First some data:
df <- data.frame(dens = rnorm(1000),
split = as.factor(sample(1:2, 1000, replace = TRUE)),
groups = as.factor(rep(1:5, each = 200)))
The plot call is straightforward:
library(ggplot2)
ggplot(df, aes(groups, dens, fill = split)) +
geom_split_violin(alpha = 0.7)
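To see why this works for more than two x groups, it may help to look at the `grp %% 2` test in the `draw_group` code from the question. The sketch below is only an emulation in base R: the `toy` data frame is hypothetical, and the group numbering via `interaction()` is an assumption that mimics the two fill levels alternating within each x position. It is not ggplot2's own grouping code.

```r
# Hypothetical toy layout: five x groups, two split (fill) levels each
toy <- expand.grid(split = factor(1:2), groups = factor(1:5))

# Emulated group index (assumption: the split level varies fastest,
# so each x position gets one odd and one even group number)
toy$group <- as.integer(interaction(toy$split, toy$groups))

# Same parity rule as in draw_group: odd -> left half, even -> right half
toy$side <- ifelse(toy$group %% 2 == 1, "left", "right")
print(toy)
```

Under that assumption, every x position has exactly one "left" and one "right" half, which is why the geom is not limited to a single pair of groups as long as the fill variable has exactly two levels.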
You are probably struggling with this because your groups are not factors; convert them to factors either inside the ggplot call or beforehand.
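A minimal sketch of the conversion, assuming a hypothetical data frame `df_num` whose grouping columns were read in as plain numbers (the values are made up for illustration only):

```r
# Hypothetical data: grouping columns stored as numbers, not factors
df_num <- data.frame(dens   = c(0.1, -0.4, 1.2, 0.7),
                     split  = c(1, 2, 1, 2),
                     groups = c(1, 1, 2, 2))

# Option 1: convert beforehand
df_num$split  <- factor(df_num$split)
df_num$groups <- factor(df_num$groups)
str(df_num)

# Option 2 (not run here): convert inside the call, e.g.
# ggplot(df_num, aes(factor(groups), dens, fill = factor(split)))
```

Either way, each aesthetic must be a single column with one value per row; a whole data frame cannot be passed as `x` or `fill`.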
EDIT: after the OP provided the data:
structure(list(Age.groups = structure(1:18, .Label = c("0-04",
"05-09", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39",
"40-44", "45-49", "50-54", "55-59", "60-64", "65-69", "70-74",
"75-79", "80-84", "85+"), class = "factor"), Magen = c(0, 0,
0, 0.1, 0.2, 0.5, 1.4, 2.4, 4.4, 7.6, 13.3, 20.8, 30.3, 40.6,
56.3, 76, 97, 113.3), MH = c(0.1, 0.5, 1.5, 3.7, 4.6, 4.1, 3.4,
3.1, 2.6, 2.4, 2.4, 2.4, 2.8, 3.1, 3.5, 4.4, 4.1, 2.9), NHL = c(0.6,
1, 1.2, 1.9, 2.2, 3, 3.7, 5.2, 7.8, 10.6, 16.1, 23.2, 33.5, 47,
61.1, 73.6, 84.5, 75.7), Magen_M = c(0L, 0L, 0L, 20L, 0L, 20L,
20L, 0L, 40L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), MH_M = c(0L,
0L, 0L, 4L, 0L, 2L, 2L, 0L, 0L, 0L, 2L, 0L, 0L, 0L, 0L, 0L, 0L,
0L), NHL_M = c(0L, 0L, 0L, 0L, 20L, 0L, 0L, 0L, 0L, 20L, 20L,
0L, 20L, 0L, 0L, 0L, 0L, 0L)), .Names = c("Age.groups", "Magen",
"MH", "NHL", "Magen_M", "MH_M", "NHL_M"), class = "data.frame", row.names = c(NA,
-18L))
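Clearly, age is already binned, so a density estimate is not appropriate here. I suggest a geom_col plot that resembles a split density instead.
First the data should be converted to long format, with a few adjustments: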
library(tidyverse)
# my_data is the data frame shown above
my_data %>%
  gather(key, value, 2:7) %>% # convert the six incidence columns to long format
  mutate(split = as.factor(ifelse(grepl("_M$", key), 1, 0)), # split variable: 1 for the "_M" columns, 0 otherwise
         key = gsub("_M$", "", key), # drop the "_M" suffix; the split variable now carries that information
         value2 = ifelse(split == 1, value, -value)) -> dat # make one group negative so the bars mirror each other like a split violin
ggplot(dat, aes(x = Age.groups,
                y = value2,
                fill = split)) +
  geom_col() +
  facet_wrap(~ key, scales = "free_x") +
  coord_flip() +
  scale_y_continuous(labels = abs) # show absolute values on the axis
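To make the reshaping step concrete, here is a small base-R sketch of the same idea on two rows taken from the data above (age groups 15-19 and 20-24, columns Magen and Magen_M). The names `toy` and `long` are just illustrative, and this is a hand-rolled equivalent of the `gather()`/`mutate()` steps, not a replacement for them.

```r
# Two rows excerpted from the posted data
toy <- data.frame(Age.groups = c("15-19", "20-24"),
                  Magen      = c(0.1, 0.2),
                  Magen_M    = c(20L, 0L))

# Stack the two value columns on top of each other (long format)
long <- data.frame(Age.groups = rep(toy$Age.groups, times = 2),
                   key        = rep(c("Magen", "Magen_M"), each = nrow(toy)),
                   value      = c(toy$Magen, toy$Magen_M))

# Mark the "_M" rows, strip the suffix, and mirror the other group
long$split  <- factor(ifelse(grepl("_M$", long$key), 1, 0))
long$key    <- sub("_M$", "", long$key)
long$value2 <- ifelse(long$split == 1, long$value, -long$value)
print(long)
```

After this step each entity has one row per age group and side, with the reference values negative and the `_M` values positive, which is what lets `geom_col()` draw them back to back.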