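I have a data frame containing 2 weeks of data showing how many passengers rode the train each day. Each observation contains 3 values: the date, the number of passengers, and the day of the week. I want to compare the passenger counts for each day of the previous week with the same day of the current week (Monday to Monday, Tuesday to Tuesday, and so on). Here is the data: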
structure(list(total = structure(c(17455, 17456, 17457, 17458,
17459, 17460, 17461, 17462, 17463, 17464, 17465, 17466, 17467,
17468), class = "Date"), passengers = c(9299L, 9166L, 10234L,
10176L, 10098L, 2867L, 5416L, 9312L, 10555L, 10858L, 10169L,
9515L, 2679L, 5490L), dow = c("Monday", "Tuesday", "Wednesday",
"Thursday", "Friday", "Saturday", "Sunday", "Monday", "Tuesday",
"Wednesday", "Thursday", "Friday", "Saturday", "Sunday")), .Names =
c("total", "passengers", "dow"), class = "data.frame")
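(The automated system that creates the report uses the word "total" for the date column. I felt it necessary to point this out because it could be confusing.)
When I create a ggplot, it draws only 1 bar per day instead of 2 side by side: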
ggplot(x, aes(x=dow, y=passengers), fill=variable) +
geom_bar(stat = "identity", position = "dodge")
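I have seen reshape used to melt data in cases like this, but when I melt using the day of the week as the id.vars value, the dates are converted to scientific notation (minor problem) and ggplot can no longer find the passengers variable (major problem).
Answer 0 (score: 2)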
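A few issues need addressing. Your call specifies fill = variable (and does so outside aes()), but there is no variable named "variable" in your data frame. I would first wrangle the data frame: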
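library(dplyr)
library(ggplot2)  # needed for the plot further down
df <- x %>%
  # ISO 8601 week number, so each week gets its own fill group
  mutate(week = format(total, "%V"),
         # fix the factor levels so the x-axis runs Monday to Sunday
         dow = factor(dow, levels = c("Monday", "Tuesday", "Wednesday", "Thursday",
                                      "Friday", "Saturday", "Sunday")))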
> head(df)
       total passengers       dow week
1 2017-10-16       9299    Monday   42
2 2017-10-17       9166   Tuesday   42
3 2017-10-18      10234 Wednesday   42
4 2017-10-19      10176  Thursday   42
5 2017-10-20      10098    Friday   42
6 2017-10-21       2867  Saturday   42
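This adds a "week" variable, which takes the value 42 for the first 7 rows and 43 for the next 7. The days of the week are now also ordered from Monday to Sunday. With a variable to fill by, the two weeks can be plotted side by side: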
ggplot(df,
aes(x = dow, y = passengers, fill = week)) +
geom_col(position = "dodge")
geom_col() is equivalent to geom_bar(stat = "identity"), but requires less typing.
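If you also want the comparison as numbers rather than only as bars, the same "week" variable can be used to put the two weeks in adjacent columns. The following is a minimal base R sketch, not part of the original answer. It rebuilds the question's data by hand (the dates are assumed to start on 2017-10-16, as the head(df) output shows), and the names days, wide and change are made up for illustration.

```r
# Rebuild the question's data: 14 consecutive days starting Monday 2017-10-16.
days <- c("Monday", "Tuesday", "Wednesday", "Thursday",
          "Friday", "Saturday", "Sunday")
x <- data.frame(
  total = as.Date("2017-10-16") + 0:13,
  passengers = c(9299L, 9166L, 10234L, 10176L, 10098L, 2867L, 5416L,
                 9312L, 10555L, 10858L, 10169L, 9515L, 2679L, 5490L),
  dow = rep(days, times = 2),
  stringsAsFactors = FALSE
)

# Same week variable as in the answer: ISO 8601 week number as text.
x$week <- format(x$total, "%V")

# One row per weekday (Monday first), one column per week.
wide <- tapply(x$passengers,
               list(factor(x$dow, levels = days), x$week),
               sum)

# Week-over-week change for each weekday (week 43 minus week 42).
change <- wide[, "43"] - wide[, "42"]
print(cbind(wide, change))
```

Each weekday occurs only once per week here, so sum() simply passes the single value through; with more weeks of data you would get one column per week and could compare any pair of them.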