Superimposing/overlaying grouped bar plots in ggplot2

Date: 2018-05-27 17:01:32

Tags: r ggplot2

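I would like to make a bar plot that superimposes data from two time points, 'before' and 'after'.

At each time point, participants were asked two questions ('pain' and 'fear'), which they answered by giving a score of 1, 2 or 3.

My existing code displays the counts for the 'before' time point fine, but I can't seem to add the counts for the 'after' data.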

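Here is a sketch of what I would like the plot to look like with the 'after' data added; the black bars represent the 'after' data:

[image: sketch of the desired plot, not included]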

I would like to create the plot in ggplot2, and I have tried to adapt the code from "How to superimpose bar plots in R?", but I can't get it to work for grouped data.

Many thanks!

#DATA PREP
library(dplyr)
library(ggplot2)
library(tidyr)


df <- data.frame(before_fear=c(1,1,1,2,3),before_pain=c(2,2,1,3,1),after_fear=c(1,3,3,2,3),after_pain=c(1,1,2,3,1))


df <- df %>% gather("question", "answer_option")

# Get the counts for each answer of each question
df2 <- df %>%
  group_by(question, answer_option) %>%
  summarise(n = n())
df2 <- as.data.frame(df2)


df3 <- df2 %>% mutate(time = factor(ifelse(grepl("before", question), "before", "after"),
                                    c("before", "after")))

# change classes and split data into two data frames
df3$n <- as.numeric(df3$n)
df3$answer_option <- as.factor(df3$answer_option)
df3after <- df3[ which(df3$time=='after'), ]
df3before <- df3[ which(df3$time=='before'), ]


# CODE FOR 'BEFORE' DATA ONLY PLOT - WORKS  
ggplot(df3before, aes(fill=answer_option, y=n, x=question)) + geom_bar(position="dodge", stat="identity")



# CODE FOR 'BEFORE' AND 'AFTER' DATA PLOT - DOESN'T WORK
ggplot(mapping = aes(x, y,fill)) +
  geom_bar(data = data.frame(x = df3before$question, y = df3before$n, fill= df3before$index_value), width = 0.8, stat = 'identity') +
  geom_bar(data = data.frame(x = df3after$question, y = df3after$n, fill=df3after$index_value), width = 0.4, stat = 'identity', fill = 'black') +
  theme_classic() + scale_y_continuous(expand = c(0, 0))

2 Answers:

Answer 0 (score: 2)

I think the key is to dodge the "after" bars as if they had a width of 0.9 (i.e. the same width as the "before" bars, which is the default). In addition, because we don't map fill for the "after" bars, we need to use the group aesthetic to achieve the dodging.

I prefer to have only one data set and just subset it in each call to geom_col.

ggplot(mapping = aes(x = question, y = n, fill = factor(ans))) +
  geom_col(data = d[d$t == "before", ], position = "dodge") +
  geom_col(data = d[d$t == "after", ], aes(group = ans),
           fill = "black", width = 0.5, position = position_dodge(width = 0.9))

Data:

set.seed(2)
d <- data.frame(t = rep(c("before", "after"), each = 6),
                question = rep(c("pain", "fear"), each = 3),
                ans = 1:3,
                n = sample(12))

An alternative data preparation using data.table, starting from your original 'df' (the wide data frame, before it is reshaped):

library(data.table)
d <- melt(setDT(df), measure.vars = names(df), value.name = "ans")
d[ , c("t", "question") := tstrsplit(variable, "_")]

Either pre-calculate the counts and proceed as above with geom_col:

# count per time point, question and answer; 't' is kept because the plot code subsets on it,
# and the count column is named 'n' to match y = n
# d2 <- d[ , .(n = .N), by = .(t, question, ans)]

Or let geom_bar do the counting:

ggplot(mapping = aes(x = question, fill = factor(ans))) +
  geom_bar(data = d[d$t == "before", ], position = "dodge") +
  geom_bar(data = d[d$t == "after", ], aes(group = ans),
           fill = "black", width = 0.5, position = position_dodge(width = 0.9))
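The pre-counted data.table route from this answer can be sketched end to end as follows. This is only an illustration under stated assumptions: it starts from the wide 'df' in the question, the name d2 and the count column n are chosen here to match the plotting code, and grouping by t as well as question and ans is my assumption about what is needed for the before/after subsets to work.

```r
library(data.table)
library(ggplot2)

# Wide data from the question
df <- data.frame(before_fear = c(1, 1, 1, 2, 3), before_pain = c(2, 2, 1, 3, 1),
                 after_fear  = c(1, 3, 3, 2, 3), after_pain  = c(1, 1, 2, 3, 1))

# Long format: one row per response, then split "before_fear" into t and question
d <- melt(setDT(df), measure.vars = names(df), value.name = "ans")
d[, c("t", "question") := tstrsplit(variable, "_")]

# Pre-calculated counts (assumption: grouped by time point too, column named n)
d2 <- d[, .(n = .N), by = .(t, question, ans)]

# Coloured "before" bars, with narrower black "after" bars dodged by the same 0.9
p <- ggplot(mapping = aes(x = question, y = n, fill = factor(ans))) +
  geom_col(data = d2[t == "before"], position = "dodge") +
  geom_col(data = d2[t == "after"], aes(group = ans),
           fill = "black", width = 0.5, position = position_dodge(width = 0.9))
p
```

Because both layers dodge over a total width of 0.9, each black bar stays centred on its coloured counterpart even though it is drawn narrower.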

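Answer 1 (score: 0)

My solution is very similar to @Henrik's, but I want to point out a few things.

First, you are building data frames inside your geom_bar calls, which is probably messier than it needs to be. If you have already created df3after etc., you may as well use it inside your ggplot.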

Second, I had a hard time following your tidying. I think there are some tidyr functions that would make this easier for you, so I took a different approach, such as using separate to create the time and measure columns instead of essentially searching for them by hand, which makes it more scalable. This also lets you have "pain" and "fear" on your x-axis, instead of still having "before_pain" and "before_fear", which is no longer an accurate representation once you also have "after" values on the plot. But feel free to ignore this and stick with your own approach.

library(tidyverse)

df <- data.frame(before_fear = c(1,1,1,2,3),
                 before_pain = c(2,2,1,3,1),
                 after_fear = c(1,3,3,2,3),
                 after_pain = c(1,1,2,3,1))

df_long <- df %>%
  gather(key = question, value = answer_option) %>%
  mutate(answer_option = as.factor(answer_option)) %>%
  count(question, answer_option) %>%
  separate(question, into = c("time", "measure"), sep = "_", remove = F)

df_long
#> # A tibble: 12 x 5
#>    question    time   measure answer_option     n
#>    <chr>       <chr>  <chr>   <fct>         <int>
#>  1 after_fear  after  fear    1                 1
#>  2 after_fear  after  fear    2                 1
#>  3 after_fear  after  fear    3                 3
#>  4 after_pain  after  pain    1                 3
#>  5 after_pain  after  pain    2                 1
#>  6 after_pain  after  pain    3                 1
#>  7 before_fear before fear    1                 3
#>  8 before_fear before fear    2                 1
#>  9 before_fear before fear    3                 1
#> 10 before_pain before pain    1                 2
#> 11 before_pain before pain    2                 2
#> 12 before_pain before pain    3                 1

I split the data set into before and after, like you did, then plotted them with 2 geom_cols. I still pass df_long to ggplot, treating it as a dummy, to get uniform x and y aesthetics. Like @Henrik said, you can use different widths in geom_col and position_dodge, so that the bars are dodged as if they were 90% wide while the bars themselves are only 40% wide.

df_before <- df_long %>% filter(time == "before")
df_after <- df_long %>% filter(time == "after")

ggplot(df_long, aes(x = measure, y = n)) +
  geom_col(aes(fill = answer_option), data = df_before,
           width = 0.9, position = position_dodge(width = 0.9)) +
  geom_col(aes(group = answer_option), data = df_after,
           fill = "black", width = 0.4, position = position_dodge(width = 0.9))

Instead of making two separate data frames, you could filter within each geom_col. That is generally my preference, unless the filtering is more complex. Doing so gives the same plot as above.
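The code for the filter-inside-each-layer variant did not survive in this copy of the answer, so the following is only a sketch of what it might look like, not the answerer's original code. It assumes df_long is built with the same tidyr/dplyr pipeline as above, and the object name p is introduced here purely for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

df <- data.frame(before_fear = c(1, 1, 1, 2, 3), before_pain = c(2, 2, 1, 3, 1),
                 after_fear  = c(1, 3, 3, 2, 3), after_pain  = c(1, 1, 2, 3, 1))

# Same long-format counts as in the answer
df_long <- df %>%
  gather(key = question, value = answer_option) %>%
  mutate(answer_option = as.factor(answer_option)) %>%
  count(question, answer_option) %>%
  separate(question, into = c("time", "measure"), sep = "_", remove = FALSE)

# Each layer filters df_long itself, so no df_before/df_after objects are needed
p <- ggplot(df_long, aes(x = measure, y = n)) +
  geom_col(aes(fill = answer_option), data = filter(df_long, time == "before"),
           width = 0.9, position = position_dodge(width = 0.9)) +
  geom_col(aes(group = answer_option), data = filter(df_long, time == "after"),
           fill = "black", width = 0.4, position = position_dodge(width = 0.9))
p
```

The trade-off is mostly one of tidiness: inline filtering keeps the workspace free of intermediate data frames, while named subsets are easier to reuse when the filtering logic grows.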