I need to draw a line chart in ggplot2 using a data frame named df:
DATE ITEM NUMBER_SOLD
<date> <chr> <int>
1 2018-01-08 APPLE 3
2 2018-01-09 APPLE 3
3 2018-01-09 PEAR 2
4 2018-01-09 ORANGE 1
5 2018-01-10 APPLE 2
6 2018-01-10 PEAR 1
7 2018-01-12 CHERRY 2
8 2018-01-12 MANGO 1
9 2018-01-15 PINEAPPLE 1
10 2018-01-15 APRICOT 1
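etc.
The data frame is basically a tibble with 336 rows, each showing how many of a particular item were sold on a given day in 2018.
The plot must be a line chart of the sales of one specific item (APPLE), with the date on the x-axis and the number sold on the y-axis, plus a second line showing the sales increased by 15%, like this: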
df %>% filter(ITEM == "APPLE") %>%
ggplot(aes(DATE, NUMBER_SOLD)) +
geom_line(size = 1, col = "red") +
theme(axis.text.x = element_text(angle = 90)) +
geom_line(aes(y = NUMBER_SOLD + NUMBER_SOLD/100*15), col = "green4", size = 1, alpha = 0.6) +
scale_x_date(date_labels="%b", date_breaks = "1 month")
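However, I also need to add a legend showing what the two lines represent, e.g. the red line is the original number sold and the green line is the original number sold + 15%. How can I do that?
Answer 0 (score: 2)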
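The trick is to do the calculation in the data frame first, then use gather() to reshape the data to long format, so that all the numbers sit in one column and another variable indicates whether each number is an actual or an expected sale.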
library(tidyverse)
df <- tribble(~"DATE", ~"ITEM", ~"NUMBER_SOLD",
"2018-01-08", "APPLE", 3,
"2018-01-09", "APPLE", 3,
"2018-01-09", "PEAR", 2,
"2018-01-09", "ORANGE", 1,
"2018-01-10", "APPLE", 2,
"2018-01-10", "PEAR", 1,
"2018-01-12", "CHERRY", 2,
"2018-01-12", "MANGO", 1,
"2018-01-15", "PINEAPPLE", 1,
"2018-01-15", "APRICOT", 1) %>%
mutate(DATE = parse_date(DATE),
NUMBER_SOLD_EXP = NUMBER_SOLD + NUMBER_SOLD/100*15) %>%
gather(key = category, value = SOLD, NUMBER_SOLD, NUMBER_SOLD_EXP)
df
# A tibble: 20 x 4
DATE ITEM category SOLD
<date> <chr> <chr> <dbl>
1 2018-01-08 APPLE NUMBER_SOLD 3
2 2018-01-09 APPLE NUMBER_SOLD 3
3 2018-01-09 PEAR NUMBER_SOLD 2
4 2018-01-09 ORANGE NUMBER_SOLD 1
5 2018-01-10 APPLE NUMBER_SOLD 2
6 2018-01-10 PEAR NUMBER_SOLD 1
7 2018-01-12 CHERRY NUMBER_SOLD 2
8 2018-01-12 MANGO NUMBER_SOLD 1
9 2018-01-15 PINEAPPLE NUMBER_SOLD 1
10 2018-01-15 APRICOT NUMBER_SOLD 1
11 2018-01-08 APPLE NUMBER_SOLD_EXP 3.45
12 2018-01-09 APPLE NUMBER_SOLD_EXP 3.45
13 2018-01-09 PEAR NUMBER_SOLD_EXP 2.3
14 2018-01-09 ORANGE NUMBER_SOLD_EXP 1.15
15 2018-01-10 APPLE NUMBER_SOLD_EXP 2.3
16 2018-01-10 PEAR NUMBER_SOLD_EXP 1.15
17 2018-01-12 CHERRY NUMBER_SOLD_EXP 2.3
18 2018-01-12 MANGO NUMBER_SOLD_EXP 1.15
19 2018-01-15 PINEAPPLE NUMBER_SOLD_EXP 1.15
20 2018-01-15 APRICOT NUMBER_SOLD_EXP 1.15
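In current tidyr releases, gather() is documented as superseded by pivot_longer(). The same reshaping step might look like the sketch below, assuming tidyr 1.0.0 or later. The names df_wide and df_long and the three-row sample are illustrative only and are not part of the original answer.

```r
library(tidyverse)

# Small illustrative sample in the original wide layout (assumed, not the full data).
df_wide <- tribble(
  ~DATE,        ~ITEM,   ~NUMBER_SOLD,
  "2018-01-08", "APPLE", 3,
  "2018-01-09", "APPLE", 3,
  "2018-01-09", "PEAR",  2
) %>%
  mutate(DATE = parse_date(DATE),
         # Same arithmetic as NUMBER_SOLD + NUMBER_SOLD/100*15
         NUMBER_SOLD_EXP = NUMBER_SOLD * 1.15)

# Stack the two numeric columns into one SOLD column,
# with category recording which column each value came from.
df_long <- df_wide %>%
  pivot_longer(cols = c(NUMBER_SOLD, NUMBER_SOLD_EXP),
               names_to = "category",
               values_to = "SOLD")

df_long
```

Unlike gather(), pivot_longer() interleaves the rows (actual and expected values for each original row sit next to each other) instead of stacking one category after the other. That ordering should make no difference to the plot.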
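Now you only need to call geom_line once, mapping the colour aesthetic to the variable that indicates whether the number is an actual or an expected sale. You need to add scale_colour_manual() to specify which colour is attached to each category.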
df %>% filter(ITEM == "APPLE") %>%
ggplot(aes(DATE, SOLD)) +
geom_line(aes(colour = category), size = 1) +
scale_colour_manual(values = c("NUMBER_SOLD" = "red", "NUMBER_SOLD_EXP" = "green")) +
theme(axis.text.x = element_text(angle = 90)) +
scale_x_date(date_labels="%b", date_breaks = "1 month")
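With the defaults, the legend shows the raw category values (NUMBER_SOLD, NUMBER_SOLD_EXP) under the title "category". To get wording closer to what the question asks for, one option is to pass name, breaks and labels to scale_colour_manual(). This is a sketch rather than part of the original answer; apple_long, p and the legend title "Series" are hypothetical names chosen for illustration.

```r
library(tidyverse)

# Illustrative long-format data for APPLE only (values taken from the sample output above).
apple_long <- tribble(
  ~DATE,        ~category,         ~SOLD,
  "2018-01-08", "NUMBER_SOLD",     3,
  "2018-01-09", "NUMBER_SOLD",     3,
  "2018-01-10", "NUMBER_SOLD",     2,
  "2018-01-08", "NUMBER_SOLD_EXP", 3.45,
  "2018-01-09", "NUMBER_SOLD_EXP", 3.45,
  "2018-01-10", "NUMBER_SOLD_EXP", 2.3
) %>%
  mutate(DATE = parse_date(DATE))

p <- ggplot(apple_long, aes(DATE, SOLD)) +
  geom_line(aes(colour = category)) +
  scale_colour_manual(
    name   = "Series",                                   # legend title (assumed wording)
    breaks = c("NUMBER_SOLD", "NUMBER_SOLD_EXP"),        # order of the legend entries
    labels = c("Original number sold",
               "Original number sold + 15%"),            # text shown for each entry
    values = c("NUMBER_SOLD" = "red", "NUMBER_SOLD_EXP" = "green")
  )

p
```

The labels vector is matched to breaks by position, so the two must be kept in the same order.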