Scaled density plots in ggplot2 with the same x-axis range

Date: 2018-07-17 13:22:26

Tags: r ggplot2 dplyr purrr

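I want to overlay two density plots: one of the data before a transformation and one of the data after it. I don't care about the x and y values, only about the shape of the curves.

I would like the two plots for a given Predictor superimposed on each other, even though their x-axes differ. I find it hard to compare across two separate facets. In practice there will also be many more plots, so combining the untransformed and transformed data in a single plot would be the best solution.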

library(tidyverse)
library(caret)  # provides preProcess() and the BloodBrain data
data(BloodBrain)
bbbTrans <- preProcess(select(bbbDescr, adistd, adistm, dpsa3, inthb), method = "YeoJohnson")
bbbTransData <- predict(bbbTrans, select(bbbDescr, adistd, adistm, dpsa3, inthb)) 
dat <- bbbTransData %>%
  gather(Predictor, Value) %>%
  mutate(Transformation = "Yeo-Johnson") %>%
  bind_rows(data.frame(gather(select(bbbDescr, adistd, adistm, dpsa3, inthb), Predictor, Value), Transformation = "NA", stringsAsFactors = FALSE))  



# For the predictor adistd, I would like the x-axis range to be 0:12.5 for the
# "Yeo-Johnson" transformation and 0:250 for no transformation.  In this plot, it
# is hard to see the shape of the transformed variables due to the different x-value range.
dat %>% ggplot(aes(x = Value, color = Transformation)) +  
  geom_density(aes(y = ..scaled..), position = "dodge") + 
  facet_wrap(~Predictor, scales = "free")


# i.e., I want to superimpose the 2 charts for a given Predictor on top of each other, even though the x-axis is different
# I find it hard to look across the two facets.  In reality, as well, there will be a lot more plots, so combining the non-transformed and transformed data into the one plot using colour would be the best solution.
filter(dat, Transformation != 'NA') %>% ggplot(aes(x = Value, y = ..scaled..)) +
  geom_density() + 
  facet_wrap(~Predictor, scales = "free")

filter(dat, Transformation == 'NA') %>% ggplot(aes(x = Value, y = ..scaled..)) +
  geom_density() + 
  facet_wrap(~Predictor, scales = "free")

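Edit: I think the algorithm I need is the following (and I would prefer to use the tidyverse):

  1. Group by Predictor/Transformation
  2. Compute the density for each group
  3. Transform the density x to (x - xmin)/(xmax - xmin) so that it lies between 0 and 1
  4. Plot the transformed density$x against density$y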
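
The four steps above could be sketched in the tidyverse roughly as follows. This is only an illustration under stated assumptions: `toy` is a hypothetical stand-in for `dat` with simulated values, `log1p` stands in for the Yeo-Johnson transformation, and dividing `y` by its maximum is an extra step (not in the list) that mimics what `..scaled..` does.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for `dat`: same three columns, simulated values.
set.seed(1)
raw <- rexp(200, rate = 1 / 50)
toy <- bind_rows(
  data.frame(Predictor = "p1", Value = raw, Transformation = "NA",
             stringsAsFactors = FALSE),
  data.frame(Predictor = "p1", Value = log1p(raw), Transformation = "log1p",
             stringsAsFactors = FALSE)
)

# Steps 1-3: one density per group, then rescale x to [0, 1].
# y is additionally divided by its peak so every curve tops out at 1.
rescaled <- toy %>%
  group_by(Predictor, Transformation) %>%
  group_modify(~ {
    d <- density(.x$Value, n = 512)
    data.frame(
      x = (d$x - min(d$x)) / (max(d$x) - min(d$x)),
      y = d$y / max(d$y)
    )
  }) %>%
  ungroup()

# Step 4: both curves now share the same 0-1 x-axis and can be overlaid.
p <- ggplot(rescaled, aes(x = x, y = y, color = Transformation)) +
  geom_line() +
  facet_wrap(~Predictor)
```

With the real data, replacing `toy` by `dat` should be enough, since the column names match. Because the densities are computed up front, `geom_line()` is used instead of `geom_density()`.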

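1 Answer:

Answer 0: (score: 1)

Here is a solution that scales the values (base::scale) and then computes the density (stats::density). The density function returns the same number of equally spaced points for every group, so we can lay them out from 0 to 1 as needed.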
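
As a quick check of that claim, the small base R sketch below (using simulated values, not the BloodBrain data) looks at the grid that `density()` returns. Since the grid is equally spaced, indexing the points by position gives the same result as min-max rescaling of `d$x`.

```r
set.seed(42)
v <- rexp(100, rate = 1 / 50)               # simulated, illustrative values only
d <- density(as.numeric(scale(v)), n = 1000)

n_out <- length(d$x)                         # number of evaluation points
steps <- range(diff(d$x))                    # smallest and largest grid step

# Position-based mapping onto [0, 1]
x_by_index <- (seq_along(d$x) - 1) / (n_out - 1)

# Min-max rescaling of the actual x values
x_by_value <- (d$x - min(d$x)) / (max(d$x) - min(d$x))

same <- isTRUE(all.equal(x_by_index, x_by_value))
```

Note that the loop below uses `x / nPoints`, so its axis runs from `1 / nPoints` to 1 rather than exactly from 0; for 1000 points the difference is not visible in the plot.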

# How many points we want 
nPoints <- 1e3

# Final result
res <- list()

# Using simple loop to scale and calculate density
combinations <- expand.grid(unique(dat$Predictor), unique(dat$Transformation))
for(i in 1:nrow(combinations)) {
    # Subset data
    foo <- subset(dat, Predictor == combinations$Var1[i] & Transformation == combinations$Var2[i])
    # Perform density on scaled signal
    densRes <- density(x = scale(foo$Value), n = nPoints)
    # Position signal from 1 to wanted number of points
    res[[i]] <- data.frame(x = 1:nPoints, y = densRes$y, 
                           pred = combinations$Var1[i], trans = combinations$Var2[i])
}
res <- do.call(rbind, res)
ggplot(res, aes(x / nPoints, y, color = trans, linetype = trans)) +
    geom_line(alpha = 0.5, size = 1) +
    facet_wrap(~ pred, scales = "free")
