Question

I'm trying to add the corresponding labels to the colored segments of the bars in a histogram. Here is a reproducible example.

ggplot(aes(displ),data =mpg) + geom_histogram(aes(fill=class),binwidth = 1,col="black")

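This code produces a histogram and fills the bars with a different color for each car "class". Is there any way to add the "class" labels inside the corresponding colored segments of the plot?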

Answer 1

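The built-in functions geom_histogram and stat_bin are great for building plots quickly in ggplot. However, for more advanced styling you often need to prepare the data before building the plot. In your case, the labels overlap, which is visually messy.

The following code builds a binned frequency table from the data frame: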

# Load packages: ggplot2 provides the mpg data, plyr provides ddply
library(ggplot2)
library(plyr)

# Subset data
mpg_df <- data.frame(displ = mpg$displ, class = mpg$class)

# Bin data: edges at 0.5, 1.5, ..., 7.5 match binwidth = 1
breaks <- 1
cuts <- seq(0.5, 8, breaks)
mpg_df$bin <- .bincode(mpg_df$displ, cuts)

# Count the observations in each class/bin combination
mpg_df <- ddply(mpg_df, .(class, bin), nrow)
names(mpg_df) <- c("class", "bin", "Freq")
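
If you would rather not depend on plyr, the same frequency table can be sketched in base R. This is only an illustration under the same binning assumptions as above; the name counts is hypothetical and not part of the original answer.

```r
library(ggplot2)  # used here only for the mpg dataset

# Same bin edges as above: 0.5, 1.5, ..., 7.5
cuts <- seq(0.5, 8, by = 1)
bin <- .bincode(mpg$displ, cuts)

# Cross-tabulate class against bin, then flatten to a long data frame
counts <- as.data.frame(table(class = mpg$class, bin = bin),
                        stringsAsFactors = FALSE)

# Drop empty class/bin combinations and restore the numeric bin index
counts <- counts[counts$Freq > 0, ]
counts$bin <- as.integer(counts$bin)

head(counts)
```

The resulting columns (class, bin, Freq) have the same names as the plyr version, so the table should work with the plotting code in the same way.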

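You can use this new table to set a conditional label, so that a box is only labeled when it contains at least a certain number of observations (here, 4):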

ggplot(mpg_df, aes(x = bin, y = Freq,  fill = class)) +
  geom_bar(stat = "identity", colour = "black", width = 1) +
  geom_text(aes(label=ifelse(Freq >= 4, as.character(class), "")),
   position=position_stack(vjust=0.5), colour="black")

Repeating the class labels is reasonable, but showing the frequency of each group may be more useful:

ggplot(mpg_df, aes(x = bin, y = Freq,  fill = class)) +
  geom_bar(stat = "identity", colour = "black", width = 1) +
  geom_text(aes(label=ifelse(Freq >= 4, Freq, "")),
   position=position_stack(vjust=0.5), colour="black")

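Update

I realized that you can actually filter the labels selectively using ..count.., a variable that ggplot computes internally. No need to pre-format the data!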

ggplot(mpg, aes(x = displ, fill = class, label = class)) +
  geom_histogram(binwidth = 1,col="black") +
  stat_bin(binwidth=1, geom="text", position=position_stack(vjust=0.5), aes(label=ifelse(..count..>4, ..count.., "")))
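
Note that the ..count.. notation is deprecated as of ggplot2 3.4.0 in favor of after_stat(count). A sketch of the same plot in the newer syntax, assuming a recent ggplot2 version (the variable p is just for illustration):

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, fill = class)) +
  geom_histogram(binwidth = 1, colour = "black") +
  stat_bin(
    # Label a segment with its count only when the count exceeds 4
    aes(label = ifelse(after_stat(count) > 4, after_stat(count), "")),
    binwidth = 1,
    geom = "text",
    # Stack the labels like the bars and centre each one in its segment
    position = position_stack(vjust = 0.5)
  )

p
```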

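This post is very helpful for explaining the special variables in ggplot: Special variables in ggplot (..count.., ..density.., etc.)

The second approach only works if you want to label the data with counts. If you want to label by class or some other variable, you have to pre-build the data frame using the first approach.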

Answer 2

Looking at the example in the other Stack Overflow link you shared, all you need to do is change the vjust parameter.

ggplot(mpg, aes(x = displ, fill = class, label = class)) +
  geom_histogram(binwidth = 1,col="black") +
  stat_bin(binwidth=1, geom="text", vjust=1.5)

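That said, it looks like you have another problem: the labels pile up on top of each other because there aren't many observations at each point. Instead, I would just let people use the legend to read the plot.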

How to label a stacked histogram in ggplot
