Scatter plot in R with heavy overlap and 3000+ points

Time: 2016-07-12 15:49:19

Tags: r ggplot2

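I am making a scatter plot in R using ggplot2. I am comparing the fraction of votes Hillary and Bernie received in the primary against education level. There is a lot of overlap and way too many points. I tried using transparency so I could see the overlap, but it still looks bad.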

My Graph

Code:

demanalyze <- function(infocode, n = 1){
    infoname <- filter(infolookup, column_name == infocode)$description
    infocolumn <- as.vector(as.matrix(mydata[infocode]))
    ggplot(mydata) +
    aes(x = infocolumn) +
    ggtitle(infoname) +
    xlab(infoname) +
    ylab("Fraction of votes each candidate received") +
    geom_point(aes(y = sanders_vote_fraction, colour = "Bernie Sanders")) +#, color = alpha("blue",0.02), size=I(1)) +
    stat_smooth(aes(y = sanders_vote_fraction), method = "lm", formula = y ~ poly(x, n), size = 1, color = "darkblue", se = FALSE) +
    geom_point(aes(y = clinton_vote_fraction, colour = "Hillary Clinton")) +#, color = alpha("red",0.02), size=I(1)) +
    stat_smooth(aes(y = clinton_vote_fraction), method = "lm", formula = y ~ poly(x, n), size = 1, color = "darkred", se = FALSE) +
    scale_colour_manual("", 
        values = c("Bernie Sanders" = alpha("blue",0.02), "Hillary Clinton" = alpha("red",0.02))
    ) +
    guides(colour = guide_legend(override.aes = list(alpha = 1)))
}

What can I change to make the overlapping areas look less messy?

1 Answer:

Answer 0 (score: 3)

The standard way to plot a large number of points in two dimensions is to use a 2D density plot:

Reproducible example:

set.seed(42)  # fix the random seed so the example gives the same data each run
x1 <- rnorm(1000, mean = 10)
x2 <- rnorm(1000, mean = 10)
y1 <- rnorm(1000, mean = 5)
y2 <- rnorm(1000, mean = 7)


mydat <- data.frame(xaxis=c(x1, x2), yaxis=c(y1, y2), lab=rep(c("H","B"),each=1000))
head(mydat)

library(ggplot2)
## Dots and density plots (kinda messy, but can play with alpha)
p1 <- ggplot(mydat) +
    geom_point(aes(x = xaxis, y = yaxis, color = lab), alpha = 0.4) +
    stat_density2d(aes(x = xaxis, y = yaxis, color = lab))
p1

dots and radii

## Just density
p2 <- ggplot(mydat) +
    stat_density2d(aes(x = xaxis, y = yaxis, color = lab))
p2

density plot
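
To use the same idea on data shaped like the question's, with one column per candidate, the two columns first have to be stacked into long format so that the candidate can be mapped to colour. The following is only a rough sketch on simulated data: `wide`, `long`, and the `education` column are hypothetical stand-ins for `mydata` and whichever column `infocode` selects; only `sanders_vote_fraction` and `clinton_vote_fraction` are names taken from the question.

```r
library(ggplot2)

set.seed(1)
n <- 3000

# Hypothetical stand-in for the question's `mydata`: one row per area,
# one vote-fraction column per candidate. `education` is an assumed column.
wide <- data.frame(
    education = runif(n, min = 5, max = 60),
    sanders_vote_fraction = rbeta(n, 4, 5),
    clinton_vote_fraction = rbeta(n, 5, 4)
)

# Stack the two candidate columns into long format (one row per area and
# candidate) so that `candidate` can be mapped to colour.
long <- rbind(
    data.frame(education = wide$education,
               vote_fraction = wide$sanders_vote_fraction,
               candidate = "Bernie Sanders"),
    data.frame(education = wide$education,
               vote_fraction = wide$clinton_vote_fraction,
               candidate = "Hillary Clinton")
)

# Density contours only, one set of contours per candidate.
p_long <- ggplot(long, aes(x = education, y = vote_fraction, colour = candidate)) +
    stat_density2d() +
    scale_colour_manual("", values = c("Bernie Sanders" = "blue",
                                       "Hillary Clinton" = "red")) +
    ylab("Fraction of votes each candidate received")
p_long
```

With the data in long format, a single `stat_density2d()` layer draws both candidates, and the legend comes from the `candidate` column instead of hand-written colour labels.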

There are many parameters you can play with, so have a look here for full information on the plot types available in ggplot2.
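
As one illustration of those parameters, the contours can be drawn as filled polygons, with one panel per group so the two clouds never sit on top of each other. This is a sketch and not part of the original answer: it rebuilds the same kind of simulated `mydat` so that it runs on its own, the name `p3` is arbitrary, and it assumes a ggplot2 version recent enough to provide `after_stat()` (3.3.0 or later).

```r
library(ggplot2)

set.seed(1)
# Same shape as the example data above: two groups of 1000 points each.
mydat <- data.frame(
    xaxis = c(rnorm(1000, mean = 10), rnorm(1000, mean = 10)),
    yaxis = c(rnorm(1000, mean = 5), rnorm(1000, mean = 7)),
    lab = rep(c("H", "B"), each = 1000)
)

# Filled density contours, shaded by the computed density level,
# with one facet per group.
p3 <- ggplot(mydat, aes(x = xaxis, y = yaxis)) +
    stat_density_2d(aes(fill = after_stat(level)), geom = "polygon", alpha = 0.5) +
    facet_wrap(~ lab)
p3
```

Faceting trades direct overlay for legibility; if the two groups need to be compared on the same axes, the coloured contour version in `p2` is probably the better choice.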