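I would like to make a bar chart that overlays data from two time points, 'before' and 'after'.
At each time point, participants were asked two questions ('pain' and 'fear'), which they answered by giving a score of 1, 2 or 3.
My existing code plots the counts for the 'before' time point nicely, but I can't seem to add the counts for the 'after' data.
This is a sketch of what I'd like the plot to look like with the 'after' data added, with the black bars representing the 'after' data:
I'd like to create the plot in ggplot2, and I've tried to adapt the code from How to superimpose bar plots in R?, but I can't get it to work for grouped data.
Many thanks!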
#DATA PREP
library(dplyr)
library(ggplot2)
library(tidyr)
df <- data.frame(before_fear=c(1,1,1,2,3),before_pain=c(2,2,1,3,1),after_fear=c(1,3,3,2,3),after_pain=c(1,1,2,3,1))
df <- df %>% gather("question", "answer_option")
# Get the counts for each answer of each question
df2 <- df %>%
  group_by(question, answer_option) %>%
  summarise(n = n())
df2 <- as.data.frame(df2)
df3 <- df2 %>%
  mutate(time = factor(ifelse(grepl("before", question), "before", "after"),
                       c("before", "after")))
# Change classes and split data into two data frames
df3$n <- as.numeric(df3$n)
df3$answer_option <- as.factor(df3$answer_option)
df3after <- df3[ which(df3$time=='after'), ]
df3before <- df3[ which(df3$time=='before'), ]
# CODE FOR 'BEFORE' DATA ONLY PLOT - WORKS
ggplot(df3before, aes(fill=answer_option, y=n, x=question)) + geom_bar(position="dodge", stat="identity")
# CODE FOR 'BEFORE' AND 'AFTER' DATA PLOT - DOESN'T WORK
ggplot(mapping = aes(x, y,fill)) +
geom_bar(data = data.frame(x = df3before$question, y = df3before$n, fill= df3before$index_value), width = 0.8, stat = 'identity') +
geom_bar(data = data.frame(x = df3after$question, y = df3after$n, fill=df3after$index_value), width = 0.4, stat = 'identity', fill = 'black') +
theme_classic() + scale_y_continuous(expand = c(0, 0))
Answer 0 (score: 2)
I think the clue is to set the width of the "after" bars, and then dodge them as if their width were 0.9 (that is, the same default width as the "before" bars). Also, because we don't map fill for the "after" bars, we need to use the group aesthetic to achieve the dodging.
I prefer to have only one data set and just subset it in each call to geom_col.
ggplot(mapping = aes(x = question, y = n, fill = factor(ans))) +
  geom_col(data = d[d$t == "before", ], position = "dodge") +
  geom_col(data = d[d$t == "after", ], aes(group = ans),
           fill = "black", width = 0.5, position = position_dodge(width = 0.9))
Data:
set.seed(2)
d <- data.frame(t = rep(c("before", "after"), each = 6),
                question = rep(c("pain", "fear"), each = 3),
                ans = 1:3, n = sample(12))
Alternative data preparation, starting from your original 'df':
Using data.table:
library(data.table)
d <- melt(setDT(df), measure.vars = names(df), value.name = "ans")
d[ , c("t", "question") := tstrsplit(variable, "_")]
Either pre-calculate the counts and proceed as above with geom_col (the counts need to be grouped by t as well, and named n to match y = n):
# d2 <- d[ , .(n = .N), by = .(t, question, ans)]
Or let geom_bar do the counting:
ggplot(mapping = aes(x = question, fill = factor(ans))) +
  geom_bar(data = d[d$t == "before", ], position = "dodge") +
  geom_bar(data = d[d$t == "after", ], aes(group = ans),
           fill = "black", width = 0.5, position = position_dodge(width = 0.9))
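The pre-computed route is only hinted at above, so here is a minimal sketch of it spelled out end to end. It assumes the counts are grouped by time point as well as by question and answer, and stored in a column named n so that y = n applies; the name d2 and the use of as.data.table (to avoid modifying df by reference) are choices made for this illustration.

```r
library(data.table)
library(ggplot2)

# The asker's original wide data: one column per time point and question
df <- data.frame(before_fear = c(1, 1, 1, 2, 3),
                 before_pain = c(2, 2, 1, 3, 1),
                 after_fear  = c(1, 3, 3, 2, 3),
                 after_pain  = c(1, 1, 2, 3, 1))

# Reshape to long format and split "before_fear" into t = "before", question = "fear"
d <- melt(as.data.table(df), measure.vars = names(df), value.name = "ans")
d[ , c("t", "question") := tstrsplit(variable, "_")]

# Count answers per time point, question and answer (illustrative grouping)
d2 <- d[ , .(n = .N), by = .(t, question, ans)]

# Wide coloured "before" bars, with narrow black "after" bars dodged to the same positions
p <- ggplot(mapping = aes(x = question, y = n, fill = factor(ans))) +
  geom_col(data = d2[d2$t == "before", ], position = "dodge") +
  geom_col(data = d2[d2$t == "after", ], aes(group = ans),
           fill = "black", width = 0.5, position = position_dodge(width = 0.9))
```

Printing p draws the plot. Because every answer option occurs at least once for each question and time point in this data, both layers have three bars per question and the dodged positions line up.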
Answer 1 (score: 0)
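My solution is very similar to @Henrik's, but I wanted to point out a few things.
First, you are building your data frames inside geom_bar, which is probably messier than it needs to be. If you have already created df3after and so on, you might as well use them inside ggplot.
Second, I had a hard time following your tidying. I think a couple of tidyr functions would make this task easier, so I took a different approach, such as using separate to create the time and measure columns rather than essentially searching for them manually, which makes it more scalable. This also lets you have "pain" and "fear" on your x-axis rather than "before_pain" and "before_fear", which are no longer accurate labels once the "after" values are on the plot as well. But feel free to disregard this and stick with your own approach.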
library(tidyverse)
df <- data.frame(before_fear = c(1,1,1,2,3),
                 before_pain = c(2,2,1,3,1),
                 after_fear = c(1,3,3,2,3),
                 after_pain = c(1,1,2,3,1))
df_long <- df %>%
  gather(key = question, value = answer_option) %>%
  mutate(answer_option = as.factor(answer_option)) %>%
  count(question, answer_option) %>%
  separate(question, into = c("time", "measure"), sep = "_", remove = FALSE)
df_long
#> # A tibble: 12 x 5
#>    question    time   measure answer_option     n
#>    <chr>       <chr>  <chr>   <fct>         <int>
#>  1 after_fear  after  fear    1                 1
#>  2 after_fear  after  fear    2                 1
#>  3 after_fear  after  fear    3                 3
#>  4 after_pain  after  pain    1                 3
#>  5 after_pain  after  pain    2                 1
#>  6 after_pain  after  pain    3                 1
#>  7 before_fear before fear    1                 3
#>  8 before_fear before fear    2                 1
#>  9 before_fear before fear    3                 1
#> 10 before_pain before pain    1                 2
#> 11 before_pain before pain    2                 2
#> 12 before_pain before pain    3                 1
I split this into before and after data sets, as you did, then plotted them with two geom_col layers. I still put df_long into ggplot, treating it almost like a dummy, to get uniform x and y aesthetics. As @Henrik said, you can use a different width in geom_col and in position_dodge, so that the bars are dodged as if they were 90% wide while the bars themselves are only 40% wide.
df_before <- df_long %>% filter(time == "before")
df_after <- df_long %>% filter(time == "after")
ggplot(df_long, aes(x = measure, y = n)) +
  geom_col(aes(fill = answer_option),
           data = df_before, width = 0.9,
           position = position_dodge(width = 0.9)) +
  geom_col(aes(group = answer_option),
           data = df_after, fill = "black", width = 0.4,
           position = position_dodge(width = 0.9))
Instead of making two separate data frames, you could filter inside each geom_col. This is generally my preference unless the filtering is more complex, and it produces the same plot as above.
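The code for filtering inside each geom_col did not survive in this copy of the thread, so the following is only a sketch of what that approach might look like, not the answerer's original code. It relies on the fact that a ggplot2 layer's data argument may be a function, which is called on the plot's default data; the anonymous functions and the explicit dplyr::filter calls are choices made for this illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

df <- data.frame(before_fear = c(1, 1, 1, 2, 3),
                 before_pain = c(2, 2, 1, 3, 1),
                 after_fear  = c(1, 3, 3, 2, 3),
                 after_pain  = c(1, 1, 2, 3, 1))

# Same tidying as in the answer: long format, counts, then split the question name
df_long <- df %>%
  gather(key = question, value = answer_option) %>%
  mutate(answer_option = as.factor(answer_option)) %>%
  count(question, answer_option) %>%
  separate(question, into = c("time", "measure"), sep = "_", remove = FALSE)

# Each layer receives df_long and keeps only its own time point
p <- ggplot(df_long, aes(x = measure, y = n)) +
  geom_col(aes(fill = answer_option),
           data = function(d) dplyr::filter(d, time == "before"),
           width = 0.9, position = position_dodge(width = 0.9)) +
  geom_col(aes(group = answer_option),
           data = function(d) dplyr::filter(d, time == "after"),
           fill = "black", width = 0.4,
           position = position_dodge(width = 0.9))
```

Printing p draws the plot. Keeping the filter next to the layer it feeds avoids the intermediate df_before and df_after objects, at the cost of slightly longer geom_col calls.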