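I'm new to R, and I'm trying to create a time series plot of the median concentration of a given variable in my dataset. However, I'm not getting the result I want, and I don't understand what I'm doing wrong. Once I create the new data frame (data_median), some of the values come out as NA, which leaves the plot incomplete. I would really appreciate any help. Thanks in advance!
A preview of my code: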
if (!require(pacman)) {
  install.packages('pacman')
}
pacman::p_load("ggplot2","tidyr","plyr","dplyr")
#### Read in the necessary data ######
roadsalt_data <- read.table("QADportaldata_1988-2015.tsv", header = TRUE, sep = "\t", fill = TRUE, stringsAsFactors = FALSE)
#Convert date column from a character class to a date class so ggplot can display as a continuous variable ###
roadsalt_data$stdate <- as.Date(roadsalt_data$stdate)
## Filter dataset to only contain columns I need ########
filtered_param <- roadsalt_data %>%
  select(orgid, stdate, locid, charnam, val) %>%
  filter(between(stdate, as.Date("1996-01-01"), as.Date("2015-07-01"))) %>%
  filter(charnam == "Chloride")
filtered_param$val <- as.numeric(as.character(filtered_param$val))
data_median <- filtered_param %>%
  mutate(year = as.Date(cut(stdate, breaks = "year"))) %>%
  group_by(year) %>%
  summarize(xmedian = median(val))
## theme for plots ####
graph_theme <- theme_linedraw() +
  theme(plot.title = element_text(size = 15, face = "bold", vjust = 0.5, hjust = 0.5),
        legend.text = element_text(size = 10, face = "bold"))
graph1 <- ggplot(data_median, aes(year, xmedian)) +
  geom_line(color = "blue") +
  scale_x_date(date_labels = "%Y", date_breaks = "2 year") +
  ggtitle("Median Chloride Concentration (mg/L);1997-\n2015") +
  xlab("Date") + ylab("Median Chloride Concentration") +
  graph_theme
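One possible explanation for the NA values in data_median, offered as a sketch rather than a confirmed diagnosis: median() returns NA whenever its input contains an NA, and as.numeric() turns any string it cannot parse into NA. The toy data frame below is invented for illustration (the "<0.5" entry is an assumed example of a non-numeric value, not something taken from the real file). Passing na.rm = TRUE drops the missing values before the median is computed.

```r
library(dplyr)

# Toy stand-in for filtered_param; the values are invented for illustration.
toy <- data.frame(
  stdate = as.Date(c("2014-03-01", "2014-09-15", "2015-06-02", "2015-06-16")),
  val    = c("23.6", "<0.5", "22.5", "78.3"),
  stringsAsFactors = FALSE
)

# "<0.5" cannot be parsed as a number, so it becomes NA
# (R normally emits a coercion warning here; it is suppressed for the demo).
toy$val <- suppressWarnings(as.numeric(toy$val))

toy_median <- toy %>%
  mutate(year = as.Date(cut(stdate, breaks = "year"))) %>%
  group_by(year) %>%
  summarize(
    median_with_na = median(val),               # NA if any value in the year is NA
    median_na_rm   = median(val, na.rm = TRUE)  # missing values are dropped first
  )

print(toy_median)
```

If this is what is happening in the real data, years that contain at least one unparseable value would come out as NA in xmedian, and geom_line() would leave gaps at those years.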
A preview of my dataset:
# A tibble: 16,209 x 5
   orgid   stdate     locid           charnam  val
   <chr>   <date>     <chr>           <chr>    <chr>
 1 USGS-NJ 2014-11-20 USGS-01482500   Chloride 23.6
 2 USGS-NJ 2015-06-24 USGS-0146453250 Chloride 221
 3 USGS-NJ 2014-09-15 USGS-01392150   Chloride 144
 4 USGS-NJ 2015-05-28 USGS-01411035   Chloride 10.8
 5 USGS-NJ 2015-06-16 USGS-01411466   Chloride 10.5
 6 USGS-NJ 2015-06-16 USGS-01411444   Chloride 5.76
 7 USGS-NJ 2015-06-16 USGS-01463500   Chloride 27.5
 8 USGS-NJ 2015-06-16 USGS-01407821   Chloride 37.1
 9 USGS-NJ 2015-06-02 USGS-01464527   Chloride 22.5
10 USGS-NJ 2015-06-02 USGS-01405340   Chloride 78.3
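Since the preview shows val stored as character (<chr>), it may be worth checking which entries fail to convert to numbers. The snippet below is a minimal sketch using a made-up vector in place of the real column; the non-numeric strings ("<0.5", "", "ND") are assumed examples only.

```r
# Toy stand-in for the val column; the non-numeric entries are assumed examples.
val_raw <- c("23.6", "221", "<0.5", "", "ND", "10.8")

# Convert to numeric; unparseable strings become NA.
val_num <- suppressWarnings(as.numeric(val_raw))

# Entries that were present in the raw data but did not survive the conversion.
failed <- is.na(val_num) & !is.na(val_raw)

print(sum(failed))              # how many values were lost in the conversion
print(unique(val_raw[failed]))  # which strings could not be parsed
```

The same check could be run on the real data by replacing val_raw with the character val column before it is converted.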