I have a dataset with dates in one field and NAs in another. I created it as a subset of a larger dataset because I need to see whether the NAs come from one time period or are spread more evenly across all time.
My data looks like this:
User_id | Date | app_version
001 | 2016-01-03 | <NA>
002 | 2016-03-03 | <NA>
003 | 2016-02-22 | <NA>
004 | 2016-04-15 | <NA>
...
What I'd like to do is plot a line graph with time on the X axis and number of NAs on the Y axis.
Thanks in advance.
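Answer 0 (score: 1)
Using dplyr and ggplot2: group the data accordingly, summarize to count the number of NA values, then plot. (In this case, I grouped by Date and added geom_point to show each date.)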
library(dplyr)
library(ggplot2)
df %>%
group_by(Date) %>%
summarize(na_count = sum(is.na(app_version))) %>%
ggplot(aes(x = Date, y = na_count)) +
geom_line() +
geom_point()
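If Date is stored as character rather than as a Date, geom_line() will not connect the points, and daily counts can be noisy. Here is a minimal sketch of the same approach binned by month. The sample data frame and the names month and monthly are made up for illustration, and it assumes dates arrive as "YYYY-MM-DD" strings:

```r
library(dplyr)
library(ggplot2)

# Hypothetical sample data; app_version mixes missing and present values
df <- data.frame(
  User_id = c("001", "002", "003", "004", "005"),
  Date = c("2016-01-03", "2016-01-20", "2016-02-22", "2016-03-03", "2016-03-15"),
  app_version = c(NA, NA, "1.2", NA, NA),
  stringsAsFactors = FALSE
)

monthly <- df %>%
  mutate(Date = as.Date(Date),                            # character -> Date
         month = as.Date(format(Date, "%Y-%m-01"))) %>%   # first day of each month
  group_by(month) %>%
  summarize(na_count = sum(is.na(app_version)))           # NAs per month

p <- ggplot(monthly, aes(x = month, y = na_count)) +
  geom_line() +
  geom_point()
print(p)
```

Months with no NAs still appear with a count of 0, as long as they have at least one row in the data.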
Answer 1 (score: 0)
Your database
class
Your plot
User_id<-c("001","002","003","004")
Date<-c("2016-01-03","2016-03-03","2016-02-22","2016-04-15")
app_version<-c(NA,NA,NA,NA)
# build the data frame directly; cbind() would coerce every column to character
db<-data.frame(User_id, Date=as.Date(Date), app_version)
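One way to carry on from here, as a sketch in base R and not necessarily what the answerer had in mind: check the column classes, count the NAs per date, and draw a line plot. The names col_classes and na_by_date are made up for illustration.

```r
# Same sample values as above, with Date stored as a Date column
db <- data.frame(
  User_id = c("001", "002", "003", "004"),
  Date = as.Date(c("2016-01-03", "2016-03-03", "2016-02-22", "2016-04-15")),
  app_version = NA_character_,   # recycled to all four rows
  stringsAsFactors = FALSE
)

# Check the class of each column; Date should be "Date", not character or factor
col_classes <- sapply(db, class)
print(col_classes)

# Count NAs in app_version for each date (result is sorted by Date)
na_by_date <- aggregate(is.na(db$app_version), by = list(Date = db$Date), FUN = sum)
names(na_by_date)[2] <- "na_count"

# type = "b" draws both the points and the connecting line
plot(na_by_date$Date, na_by_date$na_count, type = "b",
     xlab = "Date", ylab = "Number of NAs")
```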
Answer 2 (score: 0)
library(plyr)
library(ggplot2)
#create a field that breaks the dates down to just year & month
#You can break it down by year if you'd like
df$yr_mth<-substr(df$Date, 1, 7)
#summarize the number of NAs per year_month
df1<-ddply(df, .(yr_mth), summarize,
num_na=length(which(is.na(app_version))))
#plot yr_mth on x, num_na on y
#as.Date() needs a full date, so append a day to the "YYYY-MM" string
ggplot(data=df1, aes(x=as.Date(paste0(yr_mth, "-01")), y=num_na))+
geom_point()