分组条形图y轴 - ggplot

时间:2018-02-09 17:42:45

标签: r ggplot2

我正在制作一个分组的条形图,希望通过三个不同考试的学习时数来显示学生的最终成绩范围。

到目前为止,我已经成功构建了这些内容。忽略y轴和实际的条形,因为它们不正确。我想要的是y轴是在给定学习时数内获得给定成绩的学生人数。

enter image description here

以下是我使用的代码:

ggplot(P1, aes (x=Hours,y=Bins)) +
  geom_bar(aes(fill= Bins),stat="identity", position= "dodge")+
  facet_grid(Exam~.)+
  scale_x_discrete(limits=c("0-1 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs", "8-9 hrs", "10-11 hrs", "12-13 hrs", "14+ hrs")) + 
  labs (x= "Hours Spent Studying", y="# of Students")

我很确定问题出在我的数据输入上,如下所示:

ID Exam     Hours    Bins
1 S001    1   0-1 hrs  61-70%
2 S002    1   4-5 hrs  51-60%
3 S003    1 12-13 hrs  51-60%
4 S004    1   6-7 hrs 91-100%
5 S005    1   6-7 hrs  81-90%
6 S006    1 12-13 hrs  61-70%

我想我需要添加一个" Count"列用作y轴。但是,我很困惑,因为在制作普通的非分组条形图时我并不需要这样做。这是解决方案吗?如果是这样,我该如何添加计数列?

我是R的初学者。

以下是我的数据框的输入:

> dput(P1)
structure(list(ID = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 
22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 
35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 
48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 
61L, 62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L, 
74L, 75L, 76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L, 
87L, 88L, 89L, 90L, 91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L, 1L, 
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 
16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 
29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 
42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 
55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 
68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 76L, 77L, 78L, 79L, 80L, 
81L, 82L, 83L, 84L, 85L, 86L, 87L, 88L, 89L, 90L, 91L, 92L, 93L, 
94L, 95L, 96L, 97L, 98L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 
23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 
36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 
49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 
62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 
75L, 76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L, 87L, 
88L, 89L, 90L, 91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L), .Label = c("S001", 
"S002", "S003", "S004", "S005", "S006", "S007", "S008", "S009", 
"S010", "S011", "S012", "S013", "S014", "S015", "S016", "S017", 
"S018", "S019", "S020", "S021", "S022", "S023", "S024", "S025", 
"S026", "S027", "S028", "S029", "S030", "S031", "S032", "S033", 
"S034", "S035", "S036", "S037", "S038", "S039", "S040", "S041", 
"S042", "S043", "S044", "S045", "S046", "S047", "S048", "S049", 
"S050", "S051", "S052", "S053", "S054", "S055", "S056", "S057", 
"S058", "S059", "S060", "S061", "S062", "S063", "S064", "S065", 
"S066", "S067", "S068", "S069", "S070", "S071", "S072", "S073", 
"S074", "S075", "S076", "S077", "S078", "S079", "S080", "S081", 
"S082", "S083", "S084", "S085", "S086", "S087", "S088", "S089", 
"S090", "S091", "S092", "S093", "S094", "S095", "S096", "S097", 
"S098"), class = "factor"), Exam = c("1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
"2", "2", "2", "2", "2", "2", "2", "2", "2", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", 
"3", "3", "3"), Hours = c("0-1 hrs", "4-5 hrs", "12-13 hrs", 
"6-7 hrs", "6-7 hrs", "12-13 hrs", "14+ hrs", "6-7 hrs", "2-3 hrs", 
"6-7 hrs", "4-5 hrs", "8-9 hrs", "6-7 hrs", "12-13 hrs", "2-3 hrs", 
"12-13 hrs", "2-3 hrs", "4-5 hrs", "10-11 hrs", "0-1 hrs", "4-5 hrs", 
"10-11 hrs", "4-5 hrs", "8-9 hrs", "0-1 hrs", "12-13 hrs", "2-3 hrs", 
"6-7 hrs", "6-7 hrs", "10-11 hrs", "10-11 hrs", "6-7 hrs", "10-11 hrs", 
"12-13 hrs", "6-7 hrs", "6-7 hrs", "14+ hrs", "2-3 hrs", "4-5 hrs", 
"6-7 hrs", "4-5 hrs", "4-5 hrs", "8-9 hrs", "8-9 hrs", "2-3 hrs", 
"14+ hrs", "2-3 hrs", "2-3 hrs", "8-9 hrs", "8-9 hrs", "6-7 hrs", 
"14+ hrs", "8-9 hrs", "10-11 hrs", "10-11 hrs", "8-9 hrs", "2-3 hrs", 
"8-9 hrs", "8-9 hrs", "4-5 hrs", "2-3 hrs", "4-5 hrs", "2-3 hrs", 
"4-5 hrs", "4-5 hrs", "6-7 hrs", "6-7 hrs", "2-3 hrs", "6-7 hrs", 
"4-5 hrs", "8-9 hrs", "14+ hrs", "0-1 hrs", "4-5 hrs", "10-11 hrs", 
"4-5 hrs", "4-5 hrs", "8-9 hrs", "4-5 hrs", "12-13 hrs", "4-5 hrs", 
"6-7 hrs", "8-9 hrs", "6-7 hrs", "2-3 hrs", "6-7 hrs", "6-7 hrs", 
"4-5 hrs", "10-11 hrs", "4-5 hrs", "4-5 hrs", "10-11 hrs", "12-13 hrs", 
"4-5 hrs", "6-7 hrs", "4-5 hrs", "4-5 hrs", "12-13 hrs", "2-3 hrs", 
"6-7 hrs", "10-11 hrs", "6-7 hrs", "8-9 hrs", "6-7 hrs", "8-9 hrs", 
"2-3 hrs", "4-5 hrs", "8-9 hrs", "4-5 hrs", "4-5 hrs", "6-7 hrs", 
"6-7 hrs", "6-7 hrs", "14+ hrs", "4-5 hrs", "10-11 hrs", "6-7 hrs", 
"0-1 hrs", "4-5 hrs", "12-13 hrs", "6-7 hrs", "6-7 hrs", "2-3 hrs", 
"12-13 hrs", "2-3 hrs", "4-5 hrs", "10-11 hrs", "8-9 hrs", "4-5 hrs", 
"8-9 hrs", "8-9 hrs", "14+ hrs", "8-9 hrs", "8-9 hrs", "4-5 hrs", 
"0-1 hrs", "4-5 hrs", "6-7 hrs", "4-5 hrs", "6-7 hrs", "8-9 hrs", 
"8-9 hrs", "2-3 hrs", "12-13 hrs", "4-5 hrs", "2-3 hrs", "6-7 hrs", 
"10-11 hrs", "6-7 hrs", "12-13 hrs", "8-9 hrs", "8-9 hrs", "10-11 hrs", 
"8-9 hrs", "2-3 hrs", "8-9 hrs", "6-7 hrs", "2-3 hrs", "6-7 hrs", 
"4-5 hrs", "2-3 hrs", "4-5 hrs", "8-9 hrs", "6-7 hrs", "4-5 hrs", 
"8-9 hrs", "8-9 hrs", "2-3 hrs", "10-11 hrs", "10-11 hrs", "0-1 hrs", 
"2-3 hrs", "6-7 hrs", "6-7 hrs", "4-5 hrs", "8-9 hrs", "2-3 hrs", 
"12-13 hrs", "0-1 hrs", "4-5 hrs", "6-7 hrs", "2-3 hrs", "2-3 hrs", 
"8-9 hrs", "6-7 hrs", "2-3 hrs", "2-3 hrs", "2-3 hrs", "4-5 hrs", 
"8-9 hrs", "6-7 hrs", "10-11 hrs", "8-9 hrs", "4-5 hrs", "6-7 hrs", 
"12-13 hrs", "2-3 hrs", "4-5 hrs", "14+ hrs", "6-7 hrs", "8-9 hrs", 
"6-7 hrs", "14+ hrs", "10-11 hrs", "4-5 hrs", "4-5 hrs", "6-7 hrs", 
"6-7 hrs", "6-7 hrs", "6-7 hrs", "2-3 hrs", "14+ hrs", "6-7 hrs", 
"6-7 hrs", "2-3 hrs", "2-3 hrs", "2-3 hrs", "14+ hrs", "4-5 hrs", 
"6-7 hrs", "0-1 hrs", "12-13 hrs", "2-3 hrs", "10-11 hrs", "10-11 hrs", 
"4-5 hrs", "6-7 hrs", "6-7 hrs", "10-11 hrs", "14+ hrs", "10-11 hrs", 
"6-7 hrs", "6-7 hrs", "2-3 hrs", "6-7 hrs", "14+ hrs", "4-5 hrs", 
"14+ hrs", "4-5 hrs", "10-11 hrs", "4-5 hrs", "8-9 hrs", "6-7 hrs", 
"4-5 hrs", "10-11 hrs", "6-7 hrs", "12-13 hrs", "14+ hrs", "6-7 hrs", 
"6-7 hrs", "10-11 hrs", "10-11 hrs", "4-5 hrs", "6-7 hrs", "10-11 hrs", 
"4-5 hrs", "6-7 hrs", "6-7 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs", 
"4-5 hrs", "6-7 hrs", "4-5 hrs", "6-7 hrs", "14+ hrs", "6-7 hrs", 
"12-13 hrs", "0-1 hrs", "10-11 hrs", "14+ hrs", "8-9 hrs", "6-7 hrs", 
"6-7 hrs", "6-7 hrs", "10-11 hrs", "12-13 hrs", "10-11 hrs", 
"10-11 hrs", "2-3 hrs", "2-3 hrs", "10-11 hrs", "6-7 hrs", "6-7 hrs", 
"6-7 hrs", "4-5 hrs", "6-7 hrs", "10-11 hrs", "6-7 hrs", "10-11 hrs", 
"4-5 hrs", "6-7 hrs", "10-11 hrs", "14+ hrs"), Bins = structure(c(3L, 
2L, 2L, 6L, 5L, 3L, 2L, 6L, 1L, 2L, 6L, 1L, 1L, 1L, 3L, 2L, 2L, 
3L, 2L, 5L, 2L, 1L, 6L, 3L, 4L, 6L, 6L, 4L, 4L, 4L, 5L, 3L, 5L, 
3L, 1L, 1L, 4L, 4L, 3L, 1L, 3L, 2L, 2L, 5L, 1L, 3L, 6L, 4L, 4L, 
5L, 1L, 4L, 2L, 3L, 3L, 2L, 4L, 3L, 3L, 4L, 6L, 1L, 6L, 3L, 4L, 
2L, 4L, 1L, 2L, 3L, 6L, 4L, 5L, 4L, 4L, 6L, 2L, 2L, 4L, 2L, 1L, 
5L, 5L, 3L, 2L, 4L, 4L, 4L, 1L, 5L, 4L, 4L, 2L, 1L, 2L, 2L, 1L, 
1L, 4L, 3L, 1L, 6L, 4L, 3L, 3L, 5L, 1L, 2L, 4L, 2L, 1L, 1L, 4L, 
1L, 1L, 3L, 3L, 6L, 1L, 1L, 6L, 2L, 6L, 6L, 4L, 1L, 4L, 4L, 5L, 
2L, 3L, 3L, 4L, 1L, 1L, 2L, 1L, 1L, 5L, 1L, 3L, 4L, 1L, 3L, 5L, 
3L, 4L, 5L, 1L, 4L, 3L, 6L, 1L, 6L, 5L, 1L, 2L, 1L, 3L, 1L, 6L, 
3L, 3L, 1L, 2L, 1L, 1L, 3L, 3L, 1L, 5L, 6L, 3L, 2L, 1L, 5L, 1L, 
1L, 1L, 2L, 3L, 5L, 6L, 1L, 3L, 2L, 3L, 2L, 1L, 2L, 5L, 2L, 4L, 
1L, 1L, 1L, 5L, 3L, 1L, 5L, 5L, 1L, 2L, 5L, 1L, 1L, 5L, 2L, 1L, 
2L, 5L, 3L, 1L, 4L, 3L, 5L, 1L, 3L, 5L, 2L, 4L, 5L, 4L, 2L, 3L, 
5L, 3L, 1L, 5L, 2L, 4L, 2L, 4L, 3L, 1L, 1L, 1L, 2L, 3L, 4L, 1L, 
3L, 5L, 4L, 4L, 4L, 2L, 4L, 3L, 3L, 3L, 2L, 4L, 1L, 6L, 1L, 5L, 
3L, 5L, 5L, 4L, 4L, 5L, 2L, 2L, 4L, 4L, 4L, 2L, 6L, 4L, 3L, 1L, 
5L, 1L, 1L, 3L, 1L, 4L, 5L, 5L, 5L, 4L, 2L, 4L, 3L, 1L, 2L, 4L, 
1L, 3L, 1L, 1L, 2L), .Label = c("50% or less", "51-60%", "61-70%", 
"71-80%", "81-90%", "91-100%"), class = "factor")), class = "data.frame", row.names = c(NA, 
-294L), .Names = c("ID", "Exam", "Hours", "Bins"))

2 个答案:

答案 0 :(得分:1)

您可以对数据框进行分组并对学生进行计数。

P2 <- with(P1, setNames(aggregate(ID, list(Hours, Bins, Exam), length),
                        c("Hours", "Bins", "Exam", "Count")))

或者使用dplyr

library(dplyr)
P2 <- P1 %>%
  group_by(Hours, Bins, Exam) %>%
  summarise(Count=n())

在情节中然后放上&#39; Count&#39;在y轴上并填充&#39; Bins&#39;。

library(ggplot2)
ggplot(P2, aes (x=Hours,y=Count)) +
  geom_bar(aes(fill= Bins), stat="identity", position= "dodge") +
  facet_grid(Exam~.) +
  scale_x_discrete(limits=c("0-1 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs",
                            "8-9 hrs", "10-11 hrs", "12-13 hrs", "14+ hrs")) + 
  scale_y_discrete(limits = seq(2, 10, 2)) +
  labs (x= "Hours Spent Studying", y="# of Students")

这会产生这个:

enter image description here

最终这就是你想要的。

您还可以考虑堆积条形图,这可能稍微更具概要性。

blevels <- levels(P2$Bins)  # save levels for labeling below 

library(ggplot2)
ggplot(P2, aes (x=Hours,y=Count)) +
  geom_bar(aes(fill= as.numeric(Bins)), stat="identity", position= "stack") +
  facet_grid(Exam~.) +
  scale_x_discrete(limits=c("0-1 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs",
                            "8-9 hrs", "10-11 hrs", "12-13 hrs", "14+ hrs")) + 
  labs(x= "Hours Spent Studying", y="# of Students", fill='Bins') +
  scale_fill_continuous(labels=blevels)

enter image description here

答案 1 :(得分:0)

这应该可以解决问题。请注意,在通过管道输入ggplot()之前,使用group_by()%&gt;%summarize()计算计数。

P1 %>% group_by(Exam, Hours, Bins) %>%
    summarize(Count = n()) %>%
    ggplot(aes(x=Hours,y=Count)) +
            geom_bar(aes(fill = Bins),stat="identity", position= "dodge")+
            facet_grid(Exam ~ .) +
            scale_x_discrete(limits=c("0-1 hrs", "2-3 hrs", "4-5 hrs", "6-7 hrs", "8-9 hrs", "10-11 hrs", "12-13 hrs", "14+ hrs")) + 
            labs (x= "Hours Spent Studying", y="# of Students")

抱歉,我不够酷,不能在SO上发布图片:)