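R ggplot: order histogram bars in descending order

Asked: 2017-04-04 20:01:37

Tags: r ggplot2 histogram

I can't figure out how to make the bars of a histogram appear in descending order with ggplot.

Here is my code, using a data frame that everyone has access to: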

library(ggplot2)
library(scales)


chol <- read.table(url("http://assets.datacamp.com/blog_assets/chol.txt"), 
header = TRUE)
ggplot(chol) +
geom_histogram(aes(x = AGE, y = ..ncount.., fill = ..ncount..),
               breaks=seq(20, 50, by = 2),
               col="red",
               alpha = .2) +
scale_fill_gradient("Percentage", low = "green", high = "red") +
scale_y_continuous(labels = percent_format()) +
labs(title="Histogram for Age") +
labs(x="Age", y="Percentage")

The result I want is a histogram with its bars in descending order.

I tried ordering the AGE column before plotting:

## set the levels in the order we want
chol <- within(chol,
               AGE <- factor(AGE,
                             levels = names(sort(table(AGE),
                                                 decreasing = TRUE))))

When I plot the reordered AGE with ggplot and geom_histogram, I get an error.

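2 Answers:

Answer 0: (score: 1)

First, I have to say that I think this will be a very confusing plot if you rearrange the x-axis; most people would assume the ages are in increasing order.

But if this is really what you want to do, geom_histogram() isn't going to help here. It's better to summarize the data yourself and use ggplot only for the plotting. Here is one way to generate the data for your plot: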

# helper function: join consecutive break points into labels such as "10-15"
pairjoin <- function(x) paste(head(x, -1), tail(x, -1), sep = "-")
# use the base hist() function to calculate the bins
dd <- with(hist(chol$AGE, breaks = seq(10, 60, by = 5), plot = FALSE),
           data.frame(N = counts,
                      age = pairjoin(breaks),
                      PCT = counts / sum(counts)))
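
To make the shape of that summary concrete, here is a minimal sketch on made-up ages. The `ages` vector and the breaks are illustrative assumptions, not the chol data.

```r
# Hypothetical ages, for illustration only (not the chol data)
ages <- c(21, 23, 23, 27, 31, 31, 31, 38)

# Same helper as in the answer: joins consecutive break points into labels
pairjoin <- function(x) paste(head(x, -1), tail(x, -1), sep = "-")

# hist() with plot = FALSE returns the break points and the per-bin counts
h <- hist(ages, breaks = seq(20, 40, by = 5), plot = FALSE)

# One row per bin: label, count, and share of all observations
dd_demo <- data.frame(age = pairjoin(h$breaks),
                      N = h$counts,
                      PCT = h$counts / sum(h$counts))
print(dd_demo)
```

Note that hist() uses right-closed bins by default, so the "20-25" label here covers ages from 20 up to and including 25.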

Now that we have the data we need, we can draw the plot:

ggplot(dd) +
geom_bar(aes(reorder(age, -PCT), PCT, fill=PCT),
    col="red", alpha = .2, stat="identity") +
scale_fill_gradient("Percentage", low = "green", high = "red") +
scale_y_continuous(labels = percent_format()) +
labs(title="Histogram for Age") +
labs(x="Age", y="Percentage")

This produces a bar chart whose age bins run from the highest percentage to the lowest.
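
The sorting in this answer comes from `reorder(age, -PCT)`. A small sketch of what that call does, using a made-up summary table (`dd_demo` and its values are assumptions for illustration):

```r
# Hypothetical summary table, for illustration only
dd_demo <- data.frame(age = c("20-25", "25-30", "30-35"),
                      PCT = c(0.2, 0.5, 0.3))

# reorder() sorts the factor levels by the second argument in ascending
# order, so negating PCT puts the bin with the largest share first
ordered_age <- reorder(dd_demo$age, -dd_demo$PCT)
print(levels(ordered_age))
```

ggplot draws a discrete x-axis in factor-level order, which is why reordering the levels is enough to reorder the bars.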

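Answer 1: (score: 0)

Although I don't recommend this, because it scrambles the ages on the x-axis, you can split the data into new groups by age (using the cut function), reorder the resulting factor by frequency, and then plot it as a bar chart: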

library(dplyr)  # provides %>%, mutate() and filter()

#Add a new column for the "bins"
chol <- chol %>% mutate(AGE2 = cut(AGE,
                           breaks = seq(min(AGE), max(AGE), by = 2),
                           right = FALSE))

#Reorders the factor by count
chol$AGE3 <- reorder(chol$AGE2, chol$AGE, FUN = function(x) 100-length(x))

#Makes the chart
chol %>% filter(AGE >= 20 & AGE < 50) %>% #This and the cut replace breaks
ggplot() +
  geom_bar(aes(x = AGE3,
               y = ..count../max(..count..), #Gives same percents on y-axis
               fill = ..count..), #Gives same percents on the scale
               col = "red",
               alpha = .2) +
  scale_fill_gradient("Percentage", low = "green", high = "red") + 
  scale_y_continuous(labels = percent_format()) +
  labs(title = "Histogram for Age") +
  labs(x = "Age", y = "Percentage")

The percentages on the y-axis don't make sense to me here, because some group is at 100%: 100% of what?
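
The 100% arises because both `..ncount..` in the question and `..count../max(..count..)` here divide each bin count by the largest bin count. A minimal sketch on made-up counts, contrasting that with dividing by the total (the numbers are illustrative assumptions):

```r
# Hypothetical bin counts, for illustration only
counts <- c(4, 10, 6)

# Scaled to the tallest bar: the largest bin is always 1 (shown as 100%)
scaled_to_max <- counts / max(counts)

# Share of all observations: the values sum to 1
share_of_total <- counts / sum(counts)

print(scaled_to_max)
print(share_of_total)
```

If the axis is meant to read as "percentage of people in each age group", the second version is probably the one to plot.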

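Also, you would still need to relabel the groups. [20,22) means the group contains values greater than or equal to 20 and less than 22 (see the Interval Notation Wikipedia page).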
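
One possible way to relabel is to pass a `labels` vector to cut(). The sketch below uses made-up ages, and the "20-22" label style is an assumption chosen for illustration:

```r
# Hypothetical ages and break points, for illustration only
ages <- c(20, 21, 22, 25)
breaks <- seq(20, 26, by = 2)

# Default labels use interval notation such as "[20,22)"
default_bins <- cut(ages, breaks = breaks, right = FALSE)

# Hypothetical friendlier labels built from consecutive break points
nice_labels <- paste(head(breaks, -1), tail(breaks, -1), sep = "-")
nice_bins <- cut(ages, breaks = breaks, right = FALSE, labels = nice_labels)

print(levels(default_bins))
print(levels(nice_bins))
```

Labels like "20-22" no longer say which end of the interval is included, so it may be worth stating that in the axis title or a caption.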