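My goal is to reproduce this figure [ref] with ggplot2 (author: Hadley Wickham).
Here is my attempt, based on geom_point and some ugly data preparation (see the code below):
How can I do this with geom_dotplot()?
In my attempts I ran into several problems: (1) mapping the default density produced by geom_dotplot to a count; (2) cutting off the axis; (3) avoiding unexpected holes. I gave up and used geom_point() instead.
I hoped (and still hope) it would be as simple as ggplot(data, aes(x,y)) + geom_dotplot(stat = "identity"), but it is not. So here is what I tried, and the output: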
# Data
df <- structure(list(x = c(79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105), y = c(1, 0, 0, 2, 1, 2, 7, 3, 7, 9, 11, 12, 15, 8, 10, 13, 11, 8, 9, 2, 3, 2, 1, 3, 0, 1, 1)), class = "data.frame", row.names = c(NA, -27L))
# dotplot based on geom_dotplot
geom_dots <- function(x, count, round = 10, breaks = NULL, ...) {
  require(ggplot2)
  n = sum(count) # total number of dots to be drawn
  b = round*round(n/round) # prettify breaks
  x = rep(x, count) # make x coordinates for dots
  if (is.null(breaks)) breaks = seq(0, 1, b/4/n)
  ggplot(data.frame(x = x), aes(x = x)) +
    geom_dotplot(method = "histodot", ...) +
    scale_y_continuous(breaks = breaks,
                       #limits = c(0, max(count)+1), # doesn't work
                       labels = breaks * n)
}
geom_dots(x = df$x, count = df$y)
# dotplot based on geom_point
ggplot_dot <- function(x, count, ...) {
  require(ggplot2)
  message("The count variable must be an integer")
  count = as.integer(count) # make sure these are counts
  n = sum(count) # total number of dots to be drawn
  x = rep(x, count) # make x coordinates for dots
  count = count[count > 0] # drop zero cases
  y = integer(0) # initialize y coordinates for dots
  for (i in seq_along(count))
    y <- c(y, 1:(count[i])) # compute y coordinates
  ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) +
    geom_point(...) # draw one dot per positive count
}
ggplot_dot(x = df$x, count = df$y,
           size = 11, shape = 21, fill = "orange", color = "black") + theme_gray(base_size = 18)
# ggsave("dotplot.png")
ggsave("dotplot.png", width = 12, height = 5.9)
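A few brief notes: with the geom_point() solution, saving the figure means getting the dimensions right so that the dots touch each other (point size and figure height/width). With the geom_dotplot() solution, I rounded the labels to make them prettier. Unfortunately, I could not cut the axis off at about 100: using limits() or coord_cartesian() rescales the whole plot instead of cutting it. Note also that to use geom_dotplot() I built a data vector from the counts, because I could not use the count variable directly (I expected stat="identity" to do that, but I could not get it to work).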
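The data preparation described in the question (one x value per dot, plus a stacking height for each dot) can also be sketched in base R without a loop. This is only an illustration, not the code used for the figure: `dot_coords` is a hypothetical helper name, and the sketch relies on `sequence()` contributing nothing for zero counts, so empty bins do not need to be dropped first.

```r
# Hypothetical helper: turn (x, count) pairs into one row per dot.
dot_coords <- function(x, count) {
  count <- as.integer(count)
  stopifnot(length(x) == length(count), all(count >= 0))
  data.frame(
    x = rep(x, count),    # repeat each x once per dot
    y = sequence(count)   # 1, 2, ..., count within each x; zero counts add nothing
  )
}

# Small example: the first four bins of the question's data
dots <- dot_coords(x = c(79, 80, 81, 82), count = c(1, 0, 0, 2))
dots
```

The resulting data frame could then be passed to ggplot(dots, aes(x, y)) + geom_point(...), in the same way as in ggplot_dot().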
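Answer 0 (score: 3)
Coincidentally, I have also spent the past day fighting with geom_dotplot() and trying to make it work. I have not found a way to make the y-axis show actual counts, but I have found a way to truncate the y-axis. As you mentioned, coord_cartesian() and limits do not work, but coord_fixed() does, since it enforces a fixed ratio between x and y units: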
library(tidyverse)
df <- structure(list(x = c(79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105), y = c(1, 0, 0, 2, 1, 2, 7, 3, 7, 9, 11, 12, 15, 8, 10, 13, 11, 8, 9, 2, 3, 2, 1, 3, 0, 1, 1)), class = "data.frame", row.names = c(NA, -27L))
df <- tidyr::uncount(df, y)
ggplot(df, aes(x)) +
  geom_dotplot(method = 'histodot', binwidth = 1) +
  scale_y_continuous(NULL, breaks = NULL) +
  # Make this as high as the tallest column
  coord_fixed(ratio = 15)
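Using 15 as the ratio works here because the x-axis uses the same units (i.e. single integers). If the x-axis is a percentage, log dollars, a date, or something else, you have to tinker with the ratio until the y-axis is truncated.
Edit: a method for combining plots
As I mentioned in a comment below, combining plots with patchwork does not work well with coord_fixed(). However, if you manually set the heights (or widths) of the combined plots to the same values as the ratios in coord_fixed(), and make sure each plot has the same x-axis, you can get pseudo-facets: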
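When x is in integer units, the ratio can probably be computed rather than guessed. The sketch below assumes that each dot is one binwidth wide, that dotsize and stackratio are left at their defaults, and that the y scale of geom_dotplot() spans one unit, so the panel needs to be as tall as the tallest stack measured in x units. That reasoning is my assumption, not something taken from the ggplot2 documentation, and `dotplot_ratio` is a hypothetical helper. It does reproduce the values 15 and 7 used in this answer.

```r
# Hypothetical helper: guess a coord_fixed() ratio for a histodot plot.
# x holds one value per dot (i.e. after tidyr::uncount()).
dotplot_ratio <- function(x, binwidth = 1) {
  bins <- floor((x - min(x)) / binwidth)  # bin index of each dot (approximate binning)
  max(table(bins)) * binwidth             # height of the tallest stack, in x units
}

# Counts from the question, expanded to one value per dot
counts <- c(1, 0, 0, 2, 1, 2, 7, 3, 7, 9, 11, 12, 15, 8, 10, 13, 11,
            8, 9, 2, 3, 2, 1, 3, 0, 1, 1)
all_x <- rep(79:105, counts)

dotplot_ratio(all_x)         # full data
dotplot_ratio(all_x[1:25])   # first 25 dots only
```

The result would be passed as coord_fixed(ratio = dotplot_ratio(df$x)). For other x units (dates, percentages) the binning assumption above may not match what geom_dotplot() does, so some manual adjustment may still be needed.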
# Make a subset of df
df2 <- df %>% slice(1:25)
plot1 <- ggplot(df, aes(x)) +
  geom_dotplot(method = 'histodot', binwidth = 1) +
  scale_y_continuous(NULL, breaks = NULL) +
  # Make this as high as the tallest column
  # Make xlim the same on both plots
  coord_fixed(ratio = 15, xlim = c(75, 110))
plot2 <- ggplot(df2, aes(x)) +
  geom_dotplot(method = 'histodot', binwidth = 1) +
  scale_y_continuous(NULL, breaks = NULL) +
  coord_fixed(ratio = 7, xlim = c(75, 110))
# Combine both plots in a single column, with each sized incorrectly
library(patchwork)
plot1 + plot2 +
  plot_layout(ncol = 1)
# Combine both plots in a single column, with each sized appropriately
library(patchwork)
plot1 + plot2 +
  plot_layout(ncol = 1, heights = c(15, 7) / (15 + 7))
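The heights passed to plot_layout() are simply the two coord_fixed() ratios rescaled to sum to one. A small sketch of that arithmetic, with `ratios` as an illustrative variable name that is not part of the code above:

```r
ratios <- c(plot1 = 15, plot2 = 7)   # the coord_fixed() ratios of the two plots
heights <- ratios / sum(ratios)      # relative heights for plot_layout()
round(heights, 3)
```

As far as I understand, patchwork treats numeric heights as relative, so the unnormalised c(15, 7) should give the same layout. Normalising just makes the proportions explicit.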
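Answer 1 (score: 2)
Is this close enough to a reproduction?
To get there, since the first plot is really a histogram, expand the example data from the count summary back into one-row-per-observation form.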
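Then use df <- tidyr::uncount(df, y) to do the expansion, and geom_dotplot() with method = 'histodot' and binwidth = 1 to get the histogram form.
For aesthetics, the y-axis is removed because it is gibberish; even the documentation says it "isn't really meaningful, so hide it".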
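Answer 2 (score: 1)
This question was something of an inspiration for an answer to a recent bounty. I decided to add that approach to this thread as well.
You can mimic geom_dotplot with another geom; I chose ggforce::geom_ellipse to have full control over the dots. It shows the counts on the y-axis. I added a few lines to make it more programmatic, and tried to reproduce the look of Hadley's figure.
Here is the final result (code below):
First, the underlying data modification and the geom: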
library(tidyverse)
library(ggforce)
df <- structure(list(x = c(79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105), y = c(1, 0, 0, 2, 1, 2, 7, 3, 7, 9, 11, 12, 15, 8, 10, 13, 11, 8, 9, 2, 3, 2, 1, 3, 0, 1, 1)), class = "data.frame", row.names = c(NA, -27L))
bin_width <- 1
pt_width <- bin_width / 3 # so that they don't touch horizontally
pt_height <- bin_width / 2 # 2 so that they will touch vertically
count_data <-
  data.frame(x = rep(df$x, df$y)) %>%
  mutate(x = plyr::round_any(x, bin_width)) %>%
  group_by(x) %>%
  mutate(y = seq_along(x))
ggplot(count_data) +
  geom_ellipse(aes(
    x0 = x,
    y0 = y,
    a = pt_width / bin_width,
    b = pt_height / bin_width,
    angle = 0
  )) +
  coord_equal((1 / pt_height) * pt_width)# to make the dot
Setting the bin width is flexible!
bin_width <- 2
# etc (same code as above)
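For reference, the binning step in count_data can be sketched in base R. With its default rounding function, plyr::round_any(x, bin_width) amounts to round(x / bin_width) * bin_width, and R's round() sends exact halves to the even neighbour. With integer x and a bin width of 2, bins can therefore end up holding unequal numbers of x values. The floor-based variant below is an alternative I am suggesting, not part of the answer, and `bin_round` and `bin_floor` are hypothetical names.

```r
x <- 80:87
bin_width <- 2

# Same arithmetic as plyr::round_any(x, bin_width) with the default f = round
bin_round <- round(x / bin_width) * bin_width
# Alternative (an assumption, not from the answer): left-closed bins of equal width
bin_floor <- floor(x / bin_width) * bin_width

table(bin_round)  # uneven: halves such as 40.5 and 41.5 round to the even neighbour
table(bin_floor)  # even: two x values per bin

# Stacking index within each bin, like group_by(x) %>% mutate(y = seq_along(x))
y <- ave(seq_along(bin_floor), bin_floor, FUN = seq_along)
```

Whether the uneven bins matter depends on the data. With bin_width <- 1 and integer x, both variants give the same bins.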
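Now, reproducing Hadley's figure in more detail was actually fun. (Although I somehow seriously doubt that he created it with ggplot!) A lot of it is not possible without some hacks. Most notable are the "cross" axis ticks and, of course, the background gradient (Baptiste helped).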
library(tidyverse)
library(grid)
library(ggforce)
p <-
  ggplot(count_data) +
  annotate(x= seq(80,104,4), y = -Inf, geom = 'text', label = '|') +
  geom_ellipse(aes(
    x0 = x,
    y0 = y,
    a = pt_width / bin_width,
    b = pt_height / bin_width,
    angle = 0
  ),
  fill = "#E67D62",
  size = 0
  ) +
  scale_x_continuous(breaks = seq(80,104,4)) +
  scale_y_continuous(expand = c(0,0.1)) +
  theme_void() +
  theme(axis.line.x = element_line(color = "black"),
        axis.text.x = element_text(color = "black",
                                   margin = margin(8,0,0,0, unit = 'pt'))) +
  coord_equal((1 / pt_height) * pt_width, clip = 'off')
oranges <- c("#FEEAA9", "#FFFBE1")
g <- rasterGrob(oranges, width = unit(1, "npc"), height = unit(0.7, "npc"), interpolate = TRUE)
grid.newpage()
grid.draw(g)
print(p, newpage = FALSE)
Created on 2020-05-01 by the reprex package (v0.3.0)