How to plot a grouped contingency table

Time: 2017-06-11 21:30:40

Tags: r grouping bar-chart

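I am using one of R's built-in datasets, UCBAdmissions, and trying to create a grouped bar chart from the data coerced to a data frame, with Dept grouped by Admit and Gender (without using ggplot).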

data(UCBAdmissions)
as.data.frame(UCBAdmissions)
      Admit Gender Dept Freq
1  Admitted   Male    A  512
2  Rejected   Male    A  313
3  Admitted Female    A   89
4  Rejected Female    A   19
5  Admitted   Male    B  353
6  Rejected   Male    B  207
7  Admitted Female    B   17
8  Rejected Female    B    8
9  Admitted   Male    C  120
10 Rejected   Male    C  205
11 Admitted Female    C  202
12 Rejected Female    C  391
13 Admitted   Male    D  138
14 Rejected   Male    D  279
15 Admitted Female    D  131
16 Rejected Female    D  244
17 Admitted   Male    E   53
18 Rejected   Male    E  138
19 Admitted Female    E   94
20 Rejected Female    E  299
21 Admitted   Male    F   22
22 Rejected   Male    F  351
23 Admitted Female    F   24
24 Rejected Female    F  317

I tried to convert the data to table format this way, but received an error message.

> barplot(table(as.data.frame(UCBAdmissions)))
Error in barplot.default(table(as.data.frame(UCBAdmissions))) : 
  'height' must be a vector or a matrix
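
One way to see where this error comes from, as a minimal sketch: table() applied to a data frame cross-tabulates every column, Freq included, so the result appears to be a four-dimensional table rather than the vector or matrix that barplot expects. The variable name tab is only illustrative.

```r
data(UCBAdmissions)

# table() on a data frame builds one dimension per column
tab <- table(as.data.frame(UCBAdmissions))

# Number of dimensions: one each for Admit, Gender, Dept and Freq
length(dim(tab))
names(dimnames(tab))

# For comparison, UCBAdmissions itself is a 3-D table of counts
dim(UCBAdmissions)
```

barplot needs at most two dimensions, so the counts have to be collapsed to a vector or a matrix first.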

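I found this SO link that offers a non-ggplot answer, but I got the error message shown above.

There is also this SO link, but its data structure is different.

I would like the data to be displayed in only two dimensions. This is what a simplified grouped bar chart would look like.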

grouped barplot

1 answer:

Answer 0 (score: 2)

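I'm not entirely sure what you are trying to achieve, but I assume you want a bar chart grouped by Dept, with the legend showing the combinations of Gender and Admit (just to illustrate the idea).

In the barplot example you point to, the data is a plain numeric matrix whose rownames and colnames are set to the labels and the groupings. You need to transform your data first (I use dplyr and tidyr from the tidyverse):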

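library(tidyverse)
# Sum Freq per Dept/Gender/Admit, then build one "Gender Admit" label per row
df2 = group_by(as.data.frame(UCBAdmissions), Dept, Gender, Admit) %>% 
    summarise(Freq = sum(Freq)) %>%
    ungroup() %>%
    mutate(GA = paste(Gender, Admit)) %>%
    select(Dept, GA, Freq) %>%
    # Reshape to wide format: one column per Dept
    spread(key = Dept, value = Freq) %>%
    as.data.frame()
# Move the labels into the rownames and keep only the numeric columns as a matrix
rownames(df2) = df2$GA
df2 = as.matrix(select(df2, -GA))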

Now your data is in a form that barplot can use:

barplot(df2, beside = TRUE, legend.text = rownames(df2))

final bar plot
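
If you would rather avoid the tidyverse entirely, the same matrix can probably be built in base R. The following is a minimal sketch, not part of the approach above: it assumes xtabs() is acceptable for the reshaping step, and the names df, GA and m are only illustrative.

```r
data(UCBAdmissions)
df <- as.data.frame(UCBAdmissions)

# Combine Gender and Admit into a single label, e.g. "Male Admitted"
df$GA <- paste(df$Gender, df$Admit)

# Cross-tabulate the counts: rows are the Gender/Admit labels, columns are Dept
m <- xtabs(Freq ~ GA + Dept, data = df)

# A 2-D table is accepted by barplot like a matrix; beside = TRUE draws grouped bars
barplot(m, beside = TRUE, legend.text = rownames(m))
```

Here xtabs() sums Freq within each GA/Dept cell, which plays the role of the group_by/summarise/spread steps.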