Change the color of points on a plot based on events in a third column

Time: 2020-03-04 17:27:29

Tags: r ggplot2

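I am creating a simple line chart from Google Analytics data that tracks the number of users per day. I added points to the line chart to mark each date. The code looks like this...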

ggplot(web_visit_vs_email_deployed.df,aes(x = date, y = users))+
 geom_line()+
 geom_point(aes(color = !is.na(Program)))+
 theme_tq() +
 labs(
   title = "Website Visits vs. Emails Deployed",
   x = "",
   y = "Users",
   color = "Email Deployed"
)

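I want to change the color of the points on dates when a marketing email was deployed. I joined the data frame used to draw the plot above with the "email date" column of another data frame that contains email performance metrics. The result is the table below...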

    date                users Program
    <dttm>              <dbl> <chr>  
  1 2020-01-01 00:00:00    80 NA     
  2 2020-01-02 00:00:00   183 NA     
  3 2020-01-03 00:00:00   176 NA     
  4 2020-01-04 00:00:00    86 NA     
  5 2020-01-05 00:00:00    87 NA     
  6 2020-01-06 00:00:00   164 NA     
  7 2020-01-07 00:00:00   177 NA     
  8 2020-01-08 00:00:00   136 NA     
  9 2020-01-09 00:00:00   515 HEA    
 10 2020-01-10 00:00:00   231 NA     
 # ... with 53 more rows

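This currently creates two separate line charts in two different colors, depending on whether the "Program" column is "NA" or "HEA". Instead, I want a single line whose points are colored differently depending on whether the "Program" column is "NA" or "HEA".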

Edit: Updated the plot and the data frame.

Edit 2: There is more than one way to solve this. Thanks for the help, everyone!

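1 Answer:

Answer 0: (score: 0)

I think the best solution is to create a dummy variable column, where 1 means you sent an email on that date and 0 means you did not. The dummy variable column can then be used for the color aesthetic.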

library(dplyr)
library(ggplot2)

# replicating your data snippet; dates are parsed with as.Date() so the x axis is continuous
web_visit_by_date.df = data.frame(
  date = as.Date(c('2020-01-01', '2020-01-02', '2020-01-03')),
  users = c(80, 183, 176)
)

email_dates = as.Date(c('2020-01-01', '2020-01-03')) # the join didn't make sense to me, so I left this as a vector

# Make the dummy variable: %in% checks whether each value in the date column
# appears in email_dates and returns TRUE or FALSE for each row. as.integer()
# turns TRUE into 1 and FALSE into 0 in the new email column.

web_visit_by_date.df$email = as.integer(web_visit_by_date.df$date %in% email_dates)

# all that's left is to set color = as.factor(email) in the aes of geom_point()
# only, so the line stays a single series and just the points change color

ggplot(web_visit_by_date.df, aes(x = date, y = users)) +
 geom_line(group = 1) +
 geom_point(aes(color = as.factor(email))) +
 labs(
 title = "Website Visits vs. Emails Deployed",
 x = "",
 y = "Users",
 color = "Email Deployed"
)
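
If you would rather keep the join from the question instead of a hand-written vector of dates, the flag can probably be derived from the Program column directly. This is only a rough sketch: `visits` and `emails` are hypothetical stand-ins for your two data frames, and I am assuming they share a `date` column of class Date.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-ins for the two data frames described in the question
visits <- data.frame(
  date  = as.Date(c("2020-01-07", "2020-01-08", "2020-01-09", "2020-01-10")),
  users = c(177, 136, 515, 231)
)
emails <- data.frame(
  date    = as.Date("2020-01-09"),
  Program = "HEA"
)

# left_join() keeps every visit date; Program is NA where no email went out,
# so !is.na(Program) gives a TRUE/FALSE flag for "email deployed"
joined <- visits %>%
  left_join(emails, by = "date") %>%
  mutate(email_deployed = !is.na(Program))

# color is mapped in geom_point() only, so geom_line() draws one line
p <- ggplot(joined, aes(x = date, y = users)) +
  geom_line() +
  geom_point(aes(color = email_deployed)) +
  labs(
    title = "Website Visits vs. Emails Deployed",
    x = "",
    y = "Users",
    color = "Email Deployed"
  )
print(p)
```

Mapping color in the top-level aes() would split the data into one group per color, which is what produces the two separate lines.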