How do I add curve-fit lines to points for multiple variables?

Asked: 2017-08-23 16:41:40

Tags: r time ggplot2 comparison line-plot

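I am trying to plot expected and actual values over time. I have several data series, and I want all of them on a single plot. I am still new to R and I keep getting stuck.

So far I have been able to get what I want on separate plots, but when I put everything on one plot I cannot get it to do what I want.

I am almost there, but I would like to connect the points (the points are the expected values) with dashed lines. I tried adding LOESS lines in a few different ways (one attempt is commented out in my code), but I keep getting errors.

I am still new to R (and to coding in general), but I know there must be a way to do this other than building the plot up manually. Every example I try does some of what I want, but I cannot get everything working at once. I am starting to understand what each piece does, but I sometimes lose track.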

  

Error in xy.coords(x, y, xlabel, ylabel) : 'x' is a list, but does not have components 'x' and 'y'

     

Error: Don't know how to add RHS to a theme object

My plot (points not yet connected):

[Image: my plot, without the points connected]

My dataset

Year,SC_CE_5AGG,SC_ACA,TA_CE_5AGG,TA_ACA,OA_CE_5AGG,OA_ACA,CO_CE_5AGG,CO_ACA
2005,8,12,5,0,140,100,23,23
2006,,13,,0,,100,,25
2007,,13,,0,,102,,37
2008,,14,,0,,104,,36
2009,,16,,3,,104,,35
2010,10,17,6,4,179,106,29,36
2011,,20,,7,,111,,36
2012,,23,,7,,116,,33
2013,,22,,10,,118,,37
2014,,23,,12,,107,,40
2015,12,23,8,14,229,112,37,46
2016,,25,,14,,119,,56
2017,,28,,13,,120,,60
2018,,,,,,,,
2019,,,,,,,,
2020,16,,10,,292,,48,
2025,20,,20,,372,,61,

My code

setwd("C:Users/X/Documents/PROJECTS/R_RcW/Data")


install.packages("ggplot2")
install.packages("GGally")
library(ggplot2)
library(GGally)

ALL <- read.csv(file="Rcw_data.csv", header = TRUE)

#To plot multiple lines, (for a small number of variables) you can use build up the plot manually yourself
ggplot(data=ALL, aes(Year)) + 
   geom_line(aes(y = SC_ACA, colour = "Shoal Creek")) + 
   lines(scatter.smooth(aes(y = SC_CE_5AGG, colour = "Shoal Creek"))) + 
   geom_line(aes(y = TA_ACA, colour = "Talladega")) +
   lines(scatter.smooth(aes(y = TA_CE_5AGG, colour = "Talladega"))) +
   geom_line(aes(y = OA_ACA, colour = "Oakmulgee")) + 
   lines(scatter.smooth(aes(y = OA_CE_5AGG, colour = "Oakmulgee"))) + 
   geom_line(aes(y = CO_ACA, colour = "Conecuh")) +
   lines(scatter.smooth(aes(y = CO_CE_5AGG, colour = "Conecuh"))) +
  #lines(lowess(SC_CE_5AGG), col="Shoal Creek") +  # lowess line (x,y) 
  #lines(lowess(TA_CE_5AGG), col="Talladega") +  # lowess line (x,y)
  #lines(lowess(OA_CE_5AGG), col="Oakmulgee") + # lowess line (x,y)
  #lines(lowess(CO_CE_5AGG), col="Conecuh") # lowess line (x,y)

  theme_classic() +
  ggtitle("Active clusters of Red-cockaded Woodpeckers") +  
  theme(plot.title = element_text(hjust = 0.5)) +
  labs(colour="District") + 
  theme(legend.title.align=0.5) +
  theme(panel.border = element_rect(colour = "black", fill=NA, size=)) +
  scale_x_continuous(limits=c(2005, 2025), breaks=c(2005,2010,2015,2020,2025)) +
  xlab("Year") + ylab("Number of active clusters")   

1 Answer:

Answer 0: (score: 0)

I think you would be better off reshaping your data to long format, for example:

library(tidyverse)
library(reshape2)

Data

ALL <- structure(list(Year = c(2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 
   2011L, 2012L, 2013L, 2014L, 2015L, 2016L, 2017L, 2018L, 2019L, 
   2020L, 2025L), SC_CE_5AGG = c(8L, NA, NA, NA, NA, 10L, NA, NA, 
   NA, NA, 12L, NA, NA, NA, NA, 16L, 20L), SC_ACA = c(12L, 13L, 
   13L, 14L, 16L, 17L, 20L, 23L, 22L, 23L, 23L, 25L, 28L, NA, NA, 
   NA, NA), TA_CE_5AGG = c(5L, NA, NA, NA, NA, 6L, NA, NA, NA, NA, 
   8L, NA, NA, NA, NA, 10L, 20L), TA_ACA = c(0L, 0L, 0L, 0L, 3L, 
   4L, 7L, 7L, 10L, 12L, 14L, 14L, 13L, NA, NA, NA, NA), OA_CE_5AGG = c(140L, 
   NA, NA, NA, NA, 179L, NA, NA, NA, NA, 229L, NA, NA, NA, NA, 292L, 
   372L), OA_ACA = c(100L, 100L, 102L, 104L, 104L, 106L, 111L, 116L, 
   118L, 107L, 112L, 119L, 120L, NA, NA, NA, NA), CO_CE_5AGG = c(23L, 
   NA, NA, NA, NA, 29L, NA, NA, NA, NA, 37L, NA, NA, NA, NA, 48L, 
   61L), CO_ACA = c(23L, 25L, 37L, 36L, 35L, 36L, 36L, 33L, 37L, 
   40L, 46L, 56L, 60L, NA, NA, NA, NA)), .Names = c("Year", "SC_CE_5AGG", 
   "SC_ACA", "TA_CE_5AGG", "TA_ACA", "OA_CE_5AGG", "OA_ACA", "CO_CE_5AGG", 
   "CO_ACA"), class = "data.frame", row.names = c(NA, -17L))

  # Reshape to long format (one row per Year/variable pair), then plot
  ALL %>% 
      melt(id="Year") %>% 
      na.omit() %>%                                      # drop rows with no value
      mutate(est = factor(grepl("5AGG", variable))) %>%  # TRUE for the expected (CE_5AGG) series
      ggplot(aes(Year, value, color=variable, lty=est)) + 
      geom_line() +
      theme_classic() +
      ggtitle("Active clusters of Red-cockaded Woodpeckers") +  
      theme(plot.title = element_text(hjust = 0.5)) +
      labs(colour="District") + 
      theme(legend.title.align=0.5) +
      theme(panel.border = element_rect(colour = "black", fill=NA, size=)) +
      scale_x_continuous(limits=c(2005, 2025), 
                         breaks=c(2005,2010,2015,2020,2025)) +
      xlab("Year") + ylab("Number of active clusters")   

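grepl is used to flag the estimated (expected) values: it returns TRUE for the columns whose names contain "5AGG", and that flag is mapped to the line type.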
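As a possible variation on the long-format idea, the sketch below splits each column name into a district and a series type, so that colour identifies the district and line type separates actual from expected values, with points drawn only on the expected series. It assumes tidyr's pivot_longer in place of reshape2's melt. The names long, code, District, Type, and district_names are hypothetical, and the data frame is rebuilt inline from the question's CSV only so the block runs on its own.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

# Values copied from the question's CSV (NA where the CSV cell is empty).
ALL <- data.frame(
  Year       = c(2005:2020, 2025),
  SC_CE_5AGG = c(8, NA, NA, NA, NA, 10, NA, NA, NA, NA, 12, NA, NA, NA, NA, 16, 20),
  SC_ACA     = c(12, 13, 13, 14, 16, 17, 20, 23, 22, 23, 23, 25, 28, NA, NA, NA, NA),
  TA_CE_5AGG = c(5, NA, NA, NA, NA, 6, NA, NA, NA, NA, 8, NA, NA, NA, NA, 10, 20),
  TA_ACA     = c(0, 0, 0, 0, 3, 4, 7, 7, 10, 12, 14, 14, 13, NA, NA, NA, NA),
  OA_CE_5AGG = c(140, NA, NA, NA, NA, 179, NA, NA, NA, NA, 229, NA, NA, NA, NA, 292, 372),
  OA_ACA     = c(100, 100, 102, 104, 104, 106, 111, 116, 118, 107, 112, 119, 120, NA, NA, NA, NA),
  CO_CE_5AGG = c(23, NA, NA, NA, NA, 29, NA, NA, NA, NA, 37, NA, NA, NA, NA, 48, 61),
  CO_ACA     = c(23, 25, 37, 36, 35, 36, 36, 33, 37, 40, 46, 56, 60, NA, NA, NA, NA)
)

# Lookup from column prefix to the district labels used in the question's legend.
district_names <- c(SC = "Shoal Creek", TA = "Talladega",
                    OA = "Oakmulgee",   CO = "Conecuh")

long <- ALL %>%
  pivot_longer(-Year, names_to = "variable", values_to = "value") %>%
  filter(!is.na(value)) %>%                       # lines then skip the empty years
  mutate(
    code     = sub("_.*$", "", variable),         # "SC_CE_5AGG" -> "SC"
    District = unname(district_names[code]),
    Type     = ifelse(grepl("5AGG", variable), "Expected", "Actual")
  )

p <- ggplot(long, aes(Year, value, colour = District, linetype = Type)) +
  geom_line() +                                             # one line per District/Type pair
  geom_point(data = filter(long, Type == "Expected")) +     # points on expected values only
  scale_linetype_manual(values = c(Actual = "solid", Expected = "dashed")) +
  scale_x_continuous(limits = c(2005, 2025), breaks = seq(2005, 2025, by = 5)) +
  labs(title = "Active clusters of Red-cockaded Woodpeckers",
       x = "Year", y = "Number of active clusters",
       colour = "District", linetype = NULL) +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5))

print(p)
```

Because the NA rows are dropped before plotting, geom_line joins the five expected values of each district directly with a dashed line. If a smoothed curve is preferred over straight segments, geom_smooth(se = FALSE) could be tried in place of geom_line for the expected series, although with only five points per district a LOESS fit may be unstable and a low-order polynomial (method = "lm", formula = y ~ poly(x, 2)) is likely the safer choice.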