How to plot each level of a factor

Time: 2020-04-11 16:35:56

Tags: r ggplot2

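I'm trying to make a plot in which each level of a factor has its own series. Although I'm a long-time R user, I haven't kept up with some of the more recent developments. For example, I never learned ggplot. Some related questions involve it, but I haven't been able to translate what I want to do into ggplot. Here is a simple example: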

library(tidyverse)  # provides read_csv(), the dplyr verbs, and %>%

in_data <- read_csv("http://www.nfgarland.ca/National_Custom_Data.csv")
in_data <- in_data %>% 
  mutate(Tot = `NUM INFLUENZA DEATHS` + `NUM PNEUMONIA DEATHS`) %>%
  arrange(SEASON) %>%
  mutate(SEASON = factor(SEASON,ordered=TRUE)) 

filter(in_data,SEASON == "2015-16")$Tot %>% plot((1:length(.)),
                                             ., 
                                             type = "l",
                                             col = "red",
                                             xlab ="Flu Season Week",
                                             ylab = "Deaths",
                                             ylim = c(2000,7500))
filter(in_data,SEASON == "2016-17")$Tot %>% lines((1:length(.)),., col="orange")
filter(in_data,SEASON == "2017-18")$Tot %>% lines((1:length(.)),. ,col="blue")
filter(in_data,SEASON == "2018-19")$Tot %>% lines((1:length(.)),. ,col="green")
filter(in_data,SEASON == "2019-20")$Tot %>% lines((1:length(.)),. ,col="black")

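As you can see, I've picked up a number of tidyverse concepts, and this code works. But I feel there really should be a way to do this automatically in the tidyverse, without defining each lines() call separately, and I haven't been able to find it. I do know how to work with palettes, so varying the colors is not a problem. Note also that although previous seasons have 52 weeks of data, the current flu season has only 24 weeks in this file.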

2 answers:

Answer 0 (score: 3)

How about this?

library(ggplot2)
ggplot(in_data, aes(x=WEEK,y=Tot, color = SEASON)) + 
  geom_line() + 
  labs(x = "Flu Season Week", y = "Deaths") +
  ylim(2000,7500) + 
  scale_color_manual(values = c("red","goldenrod","blue","orange","green"))


Edit: In response to the OP's comment about wanting a break in the 2019-20 data, we can use a quick pivot to fill in the missing values.

# Widen to one column per season (missing weeks become NA), then lengthen again
in_data %>% dplyr::select(SEASON, Tot, WEEK) %>%
  tidyr::pivot_wider(names_from = SEASON, values_from = Tot) %>%
  tidyr::pivot_longer(cols = -WEEK, names_to = "SEASON", values_to = "Tot") %>%
  ggplot(aes(x = WEEK, y = Tot, color = SEASON)) +
  geom_line() + 
  labs(x = "Flu Season Week", y = "Deaths") +
  ylim(2000,7500) + 
  scale_color_manual(values = c("red","goldenrod","blue","orange","green"))
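
To see why the round trip through pivot_wider() and pivot_longer() produces a break in the line, here is a minimal sketch on made-up toy data (the values and the season labels "A" and "B" are invented for illustration, not taken from the CSV). Widening creates one column per season, so a week that is missing for one season becomes an NA cell. Lengthening again keeps that NA as an explicit row, and geom_line() does not draw through missing values.

```r
library(tidyr)

# Toy data (made up): season "B" has no row for week 2
toy <- data.frame(
  SEASON = c("A", "A", "A", "B", "B"),
  WEEK   = c(1, 2, 3, 1, 3),
  Tot    = c(10, 12, 11, 20, 22)
)

# One column per season; the missing B / week 2 cell becomes NA
wide <- pivot_wider(toy, names_from = SEASON, values_from = Tot)

# Back to long form; the NA is now an explicit row
filled <- pivot_longer(wide, cols = -WEEK, names_to = "SEASON", values_to = "Tot")

print(filled)
```

The filled data has one row for every WEEK and SEASON combination, so the 2019-20 line in the real data should stop where its weeks run out instead of being joined across the gap.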


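Answer 1 (score: 0)

You need to use a for loop, and of course, unlike with ggplot2, you also have to specify the legend yourself. Here is a base R suggestion (the good old days):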

library(readr)
library(dplyr)

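# One color per season, named by factor level so it can be looked up in the loop
COLS <- c("red", "goldenrod", "blue", "orange", "green")
names(COLS) <- levels(in_data$SEASON)

# Empty plot with axes wide enough for all seasons
plot(NULL, xlim = range(in_data$WEEK), ylim = range(in_data$Tot),
     xlab = "time", ylab = "Tot")

# Add one line per season, using the row position within the season as x
for (nu in levels(in_data$SEASON)) {
  lines(1:sum(in_data$SEASON == nu),
        in_data$Tot[in_data$SEASON == nu],
        col = COLS[nu])
}

legend("topright", legend = names(COLS), fill = COLS)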


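If you need to specify the actual weeks, because, as you mentioned in the comments, a season runs from week 40 onward into the next year, this may take more code (and could be painful).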
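
One possible way to handle the week numbering is to convert the calendar week into a position within the season. The sketch below assumes that a season starts at week 40 and that every year has exactly 52 weeks; neither assumption is confirmed by the data file, and years with a week 53 would need extra handling. The helper name season_week is hypothetical.

```r
# Hypothetical helper: map a calendar week to its position in the flu season.
# Assumptions: the season starts at week 40 and a year has 52 weeks.
season_week <- function(week, start = 40, weeks_in_year = 52) {
  # %% returns a non-negative remainder in R, so weeks 1..39 wrap around
  ((week - start) %% weeks_in_year) + 1
}

# Weeks 40 and 52 fall early in the season; weeks 1, 11 and 39 fall later
season_week(c(40, 52, 1, 11, 39))
# → 1 13 14 24 52
```

Under these assumptions, week 11 maps to position 24, which matches the 24 weeks of data mentioned for the current season. The result could be stored in a new column (for example with mutate()) and used as the x variable in either plotting approach.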