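R, ggplot: format the x-axis as HH:MM without losing the time span

Time: 2019-03-22 06:07:22

Tags: r ggplot2

I use OpenVPN as the VPN server at work, and I have added a lot of extra functionality to it. I log per-user data usage to a MySQL database with the following columns.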

id              
username        
IP              
DateSessionStart
TimeSessionStart
SessionId       
DateLastUpdate  
TimeLastUpdate  
UserUploaded    
UserDownloaded  

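"SessionId" is the time the session started. A user can have several sessions in one day, so I don't want to use geom_line with the color aesthetic set to "username": the line then joins the end of one session to the start of the next, as in the plot below. The first session of "LW..." is short, but its end is connected to the start of the next session. That user is a poor example of the problem. I did run into another user who had logged in before midnight, so their session ended at several GB; they then started another session the same day, which produced a long, negatively sloped line down to the start of the next session at "0 MB".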

geom_line with color = username

This plot is very close to what I want.

geom_line with color = SessionIdFac

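There are two things I would like to change.

First, I want the x-axis to start at midnight and end at the following midnight. The axis should not drop the periods when nobody is connected. Some of the scale_x_... functions look applicable, but they had no obvious effect from one day to the next, and sometimes the x-axis labels changed from day to day, for example listing the month and day with HH:MM on one day and only HH:MM on another.

Second, I want each session drawn in its own color, but I want the legend to show the session's user name rather than the SessionId value. I'm not sure how to change the legend values. I used ggrepel::geom_label_repel to label each distinct session, building the label with paste from the user name, the total data used (the sum of download and upload), a line break, and the last four digits of "SessionId". As a side note, I used mutate to create "SessionIdFac" as a factor of "SessionId", because color=SessionId (where "SessionId" is an integer) produced a continuous color gradient in the legend.

Ideally the legend would show the user name for each session, as in the blue-rectangle example. I would also be fine with the red-rectangle version, where the last four digits are extracted as a substring with stringr.

Here is the full script.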

require(RMySQL)
require(tidyverse)
require(stringr)

con = dbConnect(MySQL(),
    user     = "OpenVpnBandwidthUsage",
    password = "OpenVpnBandwidthUsage",
    dbname   = "OvpnDb",
    host     = "172.16.2.100")

on.exit(dbDisconnect(con))

(DateOfVpnTransactions = Sys.Date()-7)

SqlStatement = paste(sep='',
  "select " ,
    "username , " ,
    "IP , " ,
    "DateSessionStart , " ,
    "TimeSessionStart ," ,
    "SessionId ," , 
    "DateLastUpdate ," , 
    "TimeLastUpdate," , 
    "UserUploaded + UserDownloaded as 'TotalBandwidth' " ,
  "from Bandwidth " , 
  "where DateLastUpdate = '" , DateOfVpnTransactions , "' " ,
  "order by DateLastUpdate , TimeLastUpdate , username;"
) ; SqlStatement

results = dbSendQuery(con, SqlStatement)
data = fetch(results , n=-1)
huh = dbHasCompleted(results)
dbClearResult(results)
dbDisconnect(con)

EndpointLabels = data %>% 
  group_by(SessionId) %>% 
  mutate(label = if_else(TimeLastUpdate == max(TimeLastUpdate), paste(sep='', username, ': ', TotalBandwidth, ' MB', '\n[', str_sub(SessionId,-4), ']'), NA_character_)) %>% 
  as.data.frame() %>% 
  select(label)

DATA = data %>% 
  mutate(
    label_flag         = EndpointLabels$label,
    username           = as.factor(username),
    IP                 = as.factor(IP),
    SessionIdFac       = as.factor(SessionId),
    DateLastUpdate     = as.Date(DateLastUpdate),
    DateTimeLastUpdate = as.POSIXct(paste(DateLastUpdate, TimeLastUpdate), tz='EST')
    )

(MaxData=max(DATA$TotalBandwidth))

cat('Start:' , '\t', format(as.POSIXct(paste(DateOfVpnTransactions, '00:00:00')), '%D %r'))
cat('End:'   , '\t', format(as.POSIXct(paste(DateOfVpnTransactions, '23:59:59')), '%D %r'))

DATA %>%
  ggplot(aes(x=DateTimeLastUpdate, y=TotalBandwidth)) + 
  geom_line(aes(color=SessionIdFac), show.legend = TRUE) + 
  geom_point(aes(color=SessionIdFac), show.legend = TRUE) + 
  ggrepel::geom_label_repel(aes(label=label_flag), na.rm=TRUE) +
  labs(
    title    = 'OpenVPN Bandwidth usage',
    subtitle = 'Separated by discrete session',
    x        = 'Time of Day',
    y        = 'Bandwidth used',
    color    = 'Session'
  ) +
  theme(axis.text.x = element_text(angle = 90 , hjust = 1))

ggsave(paste(sep='', DateOfVpnTransactions, '.png'), device='png', path='c:\\temp\\', width=8, height=5)

Edit: including some requested information.

DATA %>% str ; DATA %>% head(20) ; DATA %>% tail(20)
'data.frame':   569 obs. of  11 variables:
 $ username          : Factor w/ 3 levels "cr...","rg...",..: 3 2 3 2 3 2 2 3 3 2 ...
 $ IP                : Factor w/ 3 levels "1.2.3.158",..: 1 3 1 3 1 3 3 1 1 3 ...
 $ DateSessionStart  : chr  "2019-03-21" "2019-03-21" "2019-03-21" "2019-03-21" ...
 $ TimeSessionStart  : chr  "21:01:00" "22:14:39" "21:01:00" "22:14:39" ...
 $ SessionId         : int  1553216460 1553220879 1553216460 1553220879 1553216460 1553220879 1553220879 1553216460 1553216460 1553220879 ...
 $ DateLastUpdate    : Date, format: "2019-03-22" "2019-03-22" "2019-03-22" "2019-03-22" ...
 $ TimeLastUpdate    : chr  "00:00:44" "00:00:53" "00:01:50" "00:01:53" ...
 $ TotalBandwidth    : num  140 2 140 2 140 2 2 140 140 2 ...
 $ label_flag        : chr  NA NA NA NA ...
 $ SessionIdFac      : Factor w/ 7 levels "1553216460","1553220879",..: 1 2 1 2 1 2 2 1 1 2 ...
 $ DateTimeLastUpdate: POSIXct, format: "2019-03-22 00:00:44" "2019-03-22 00:00:53" "2019-03-22 00:01:50" "2019-03-22 00:01:53" ...

    username  IP         DateSessionStart  TimeSessionStart  SessionId   DateLastUpdate  TimeLastUpdate  TotalBandwidth  label_flag             SessionIdFac  DateTimeLastUpdate
1   rs...     1.2.3.158  2019-03-21        21:01:00          1553216460  2019-03-22      00:00:44        140             <NA>                   1553216460    2019-03-22 00:00:44
2   rg...     2.3.4.242  2019-03-21        22:14:39          1553220879  2019-03-22      00:00:53          2             <NA>                   1553220879    2019-03-22 00:00:53
3   rs...     1.2.3.158  2019-03-21        21:01:00          1553216460  2019-03-22      00:01:50        140             <NA>                   1553216460    2019-03-22 00:01:50
4   rg...     2.3.4.242  2019-03-21        22:14:39          1553220879  2019-03-22      00:01:53          2             <NA>                   1553220879    2019-03-22 00:01:53
5   rs...     1.2.3.158  2019-03-21        21:01:00          1553216460  2019-03-22      00:02:50        140             <NA>                   1553216460    2019-03-22 00:02:50
6   rg...     2.3.4.242  2019-03-21        22:14:39          1553220879  2019-03-22      00:02:54          2             <NA>                   1553220879    2019-03-22 00:02:54
7   rg...     2.3.4.242  2019-03-21        22:14:39          1553220879  2019-03-22      00:03:55          2             <NA>                   1553220879    2019-03-22 00:03:55
8   rs...     1.2.3.158  2019-03-21        21:01:00          1553216460  2019-03-22      00:03:55        140             <NA>                   1553216460    2019-03-22 00:03:55
9   rs...     1.2.3.158  2019-03-21        21:01:00          1553216460  2019-03-22      00:04:40        140             <NA>                   1553216460    2019-03-22 00:04:40
10  rg...     2.3.4.242  2019-03-21        22:14:39          1553220879  2019-03-22      00:04:55          2             <NA>                   1553220879    2019-03-22 00:04:55
11  rs...     1.2.3.158  2019-03-21        21:01:00          1553216460  2019-03-22      00:05:50        140             <NA>                   1553216460    2019-03-22 00:05:50
12  rg...     2.3.4.242  2019-03-21        22:14:39          1553220879  2019-03-22      00:05:56          2             <NA>                   1553220879    2019-03-22 00:05:56
13  rs...     1.2.3.158  2019-03-21        21:01:00          1553216460  2019-03-22      00:06:55        140             <NA>                   1553216460    2019-03-22 00:06:55
14  rg...     2.3.4.242  2019-03-21        22:14:39          1553220879  2019-03-22      00:06:57          2             <NA>                   1553220879    2019-03-22 00:06:57
15  rs...     1.2.3.158  2019-03-21        21:01:00          1553216460  2019-03-22      00:07:52        140             <NA>                   1553216460    2019-03-22 00:07:52
16  rg...     2.3.4.242  2019-03-21        22:14:39          1553220879  2019-03-22      00:07:58          2             <NA>                   1553220879    2019-03-22 00:07:58
17  rs...     1.2.3.158  2019-03-21        21:01:00          1553216460  2019-03-22      00:08:55        140             <NA>                   1553216460    2019-03-22 00:08:55
18  rg...     2.3.4.242  2019-03-21        22:14:39          1553220879  2019-03-22      00:08:58          2             <NA>                   1553220879    2019-03-22 00:08:58
19  rs...     1.2.3.158  2019-03-21        21:01:00          1553216460  2019-03-22      00:09:54        140             <NA>                   1553216460    2019-03-22 00:09:54
20  rg...     2.3.4.242  2019-03-21        22:14:39          1553220879  2019-03-22      00:09:58          2             <NA>                   1553220879    2019-03-22 00:09:58
550 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      11:51:51        376             <NA>                   1553258874    2019-03-22 11:51:51
551 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      11:52:52        376             <NA>                   1553258874    2019-03-22 11:52:52
552 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      11:53:55        380             <NA>                   1553258874    2019-03-22 11:53:55
553 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      11:54:55        383             <NA>                   1553258874    2019-03-22 11:54:55
554 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      11:55:55        386             <NA>                   1553258874    2019-03-22 11:55:55
555 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      11:56:55        388             <NA>                   1553258874    2019-03-22 11:56:55
556 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      11:57:55        391             <NA>                   1553258874    2019-03-22 11:57:55
557 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      11:58:55        395             <NA>                   1553258874    2019-03-22 11:58:55
558 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      11:59:55        395             <NA>                   1553258874    2019-03-22 11:59:55
559 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      12:00:55        395             <NA>                   1553258874    2019-03-22 12:00:55
560 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      12:01:56        395             <NA>                   1553258874    2019-03-22 12:01:56
561 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      12:02:55        396             <NA>                   1553258874    2019-03-22 12:02:55
562 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      12:03:56        400             <NA>                   1553258874    2019-03-22 12:03:56
563 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      12:04:56        412             <NA>                   1553258874    2019-03-22 12:04:56
564 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      12:05:56        418             <NA>                   1553258874    2019-03-22 12:05:56
565 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      12:06:56        419             <NA>                   1553258874    2019-03-22 12:06:56
566 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      12:07:56        420             <NA>                   1553258874    2019-03-22 12:07:56
567 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      12:08:56        423             <NA>                   1553258874    2019-03-22 12:08:56
568 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      12:09:56        426             <NA>                   1553258874    2019-03-22 12:09:56
569 cr...     4.5.6.150  2019-03-22        08:47:54          1553258874  2019-03-22      12:10:56        427             cr...: 427 MB\n[8874]  1553258874    2019-03-22 12:10:56

dput(DATA), trimmed to match the data above.

structure(list(

username = structure(c(3L, 2L, 3L, 2L, 3L, 2L, 2L, 3L, 3L, 2L, 3L, 2L, 3L, 2L, 
3L, 2L, 3L, 2L, 3L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L), .Label = c("cr...", "rg...", "rs..."), class = "factor"), 

IP = structure(c(1L, 3L, 1L, 3L, 1L, 3L, 3L, 1L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 3L, 
1L, 3L, 1L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L), .Label = c("108.236.141.158", "24.101.215.15", "71.72.44.242"), 
class = "factor"), 

DateSessionStart = c("2019-03-21", "2019-03-21", "2019-03-21", "2019-03-21", 
"2019-03-21", "2019-03-21", "2019-03-21", "2019-03-21", "2019-03-21", 
"2019-03-21", "2019-03-21", "2019-03-21", "2019-03-21", "2019-03-21", 
"2019-03-21", "2019-03-21", "2019-03-21", "2019-03-21", "2019-03-21", 
"2019-03-21", "2019-03-22", "2019-03-22", "2019-03-22", "2019-03-22", 
"2019-03-22", "2019-03-22", "2019-03-22", "2019-03-22", "2019-03-22", 
"2019-03-22", "2019-03-22", "2019-03-22", "2019-03-22", "2019-03-22", 
"2019-03-22", "2019-03-22", "2019-03-22", "2019-03-22", "2019-03-22", 
"2019-03-22"), 

TimeSessionStart = c("21:01:00", "22:14:39", "21:01:00", "22:14:39", "21:01:00", 
"22:14:39", "22:14:39", "21:01:00", "21:01:00", "22:14:39", "21:01:00", 
"22:14:39", "21:01:00", "22:14:39", "21:01:00", "22:14:39", "21:01:00", 
"22:14:39", "21:01:00", "22:14:39", "08:47:54", "08:47:54", "08:47:54", 
"08:47:54", "08:47:54", "08:47:54", "08:47:54", "08:47:54", "08:47:54", 
"08:47:54", "08:47:54", "08:47:54", "08:47:54", "08:47:54", "08:47:54", 
"08:47:54", "08:47:54", "08:47:54", "08:47:54", "08:47:54"), 

SessionId = c(1553216460L, 1553220879L, 1553216460L, 1553220879L, 1553216460L, 
1553220879L, 1553220879L, 1553216460L, 1553216460L, 1553220879L, 1553216460L, 
1553220879L, 1553216460L, 1553220879L, 1553216460L, 1553220879L, 1553216460L, 
1553220879L, 1553216460L, 1553220879L, 1553258874L, 1553258874L, 1553258874L, 
1553258874L, 1553258874L, 1553258874L, 1553258874L, 1553258874L, 1553258874L, 
1553258874L, 1553258874L, 1553258874L, 1553258874L, 1553258874L, 1553258874L, 
1553258874L, 1553258874L, 1553258874L, 1553258874L, 1553258874L), 

DateLastUpdate = structure(c(17977, 17977, 17977, 17977, 17977, 17977, 17977, 
17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 
17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 
17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 
17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 17977, 
17977, 17977, 17977), class = "Date"), 

TimeLastUpdate = c("00:00:44", "00:00:53", "00:01:50", "00:01:53", "00:02:50", 
"00:02:54", "00:03:55", "00:03:55", "00:04:40", "00:04:55", "00:05:50", 
"00:05:56", "00:06:55", "00:06:57", "00:07:52", "00:07:58", "00:08:55", 
"00:08:58", "00:09:54", "00:09:58", "11:51:51", "11:52:52", "11:53:55", 
"11:54:55", "11:55:55", "11:56:55", "11:57:55", "11:58:55", "11:59:55", 
"12:00:55", "12:01:56", "12:02:55", "12:03:56", "12:04:56", "12:05:56", 
"12:06:56", "12:07:56", "12:08:56", "12:09:56", "12:10:56"), 

TotalBandwidth = c(140, 2, 140, 2, 140, 2, 2, 140, 140, 2, 140, 2, 140, 2, 140, 
2, 140, 2, 140, 2, 376, 376, 380, 383, 386, 388, 391, 395, 395, 395, 395, 396, 
400, 412, 418, 419, 420, 423, 426, 427), 

label_flag = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, "cr...: 427 MB\n[8874]"), 

SessionIdFac = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 
2L, 1L, 2L, 1L, 2L, 1L, 2L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L), .Label = c("1553216460", "1553220879", "1553239153", "1553240262", "1553244013", 
"1553258796", "1553258874"), class = "factor"), 

DateTimeLastUpdate = structure(c(1553230844, 1553230853, 1553230910, 1553230913, 
1553230970, 1553230974, 1553231035, 1553231035, 1553231080, 1553231095, 
1553231150, 1553231156, 1553231215, 1553231217, 1553231272, 1553231278, 
1553231335, 1553231338, 1553231394, 1553231398, 1553273511, 1553273572, 
1553273635, 1553273695, 1553273755, 1553273815, 1553273875, 1553273935, 
1553273995, 1553274055, 1553274116, 1553274175, 1553274236, 1553274296, 
1553274356, 1553274416, 1553274476, 1553274536, 1553274596, 1553274656), 
class = c("POSIXct", "POSIXt"), tzone = "EST")), class = "data.frame", 

row.names = c(NA, -569L))

1 answer:

Answer 0 (score: 0)

For your first question, I would try expand_limits(). If you have a column that holds only the time of day, the following should work:

# hms::as_hms() replaces hms::as.hms(), which is deprecated since hms 0.5.0
ggplot(...) +
  geom_point(...) +
  expand_limits(x = c(hms::as_hms("00:00:00"), hms::as_hms("23:59:00")))

Otherwise, you need to define the limits as date-times.
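A minimal sketch of the date-time variant, assuming the x aesthetic is a POSIXct column such as DateTimeLastUpdate stored in the "EST" time zone. The toy data frame `demo`, its second session id, and the 2-hour break spacing are illustrative assumptions, not part of the question's data. scale_x_datetime() pins the limits to one full day and prints the ticks as HH:MM only:

```r
library(ggplot2)

# Hypothetical stand-in for DATA: two sessions on one day.
demo <- data.frame(
  DateTimeLastUpdate = as.POSIXct(
    c("2019-03-22 08:50:00", "2019-03-22 09:50:00",
      "2019-03-22 13:00:00", "2019-03-22 14:00:00"),
    tz = "EST"
  ),
  TotalBandwidth = c(10, 120, 5, 60),
  SessionIdFac   = factor(c("1553258874", "1553258874",
                            "1553274000", "1553274000"))
)

day_start <- as.POSIXct("2019-03-22 00:00:00", tz = "EST")
day_end   <- day_start + 24 * 60 * 60   # the following midnight

p <- ggplot(demo, aes(x = DateTimeLastUpdate, y = TotalBandwidth,
                      color = SessionIdFac)) +
  geom_line() +
  scale_x_datetime(
    limits      = c(day_start, day_end),  # midnight to midnight
    date_breaks = "2 hours",
    date_labels = "%H:%M",                # HH:MM only, no month or day
    timezone    = "EST"                   # label in the data's time zone
  )
```

Because the limits are set explicitly, the axis should cover the whole day even when no session has data in the early or late hours.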

Here are two alternatives that label the sessions with the user name:

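# create a named vector for the legend: names are session ids, values are user names
# (user_tmp was not defined in the original; assumed here to hold one row per session)
user_tmp <- DATA %>% distinct(SessionId, username)
user <- as.character(user_tmp$username)  # plain character, so labels are names, not factor codes
names(user) <- user_tmp$SessionId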

# alternative 1: several days in one chart
# the time-only column must first be converted into a time class
# (hms::as_hms() replaces the deprecated hms::as.hms())
DATA <- DATA %>% 
  mutate(TimeLastUpdate = hms::as_hms(TimeLastUpdate))

# alternative 1: plot against the time-of-day column
DATA %>% 
  ggplot(aes(x = TimeLastUpdate, y = TotalBandwidth, color = SessionIdFac)) +
  geom_line() +
  expand_limits(x = c(hms::as_hms("00:00:00"), hms::as_hms("23:59:00"))) + # expand limits on the time
  scale_color_discrete(labels = user) # labels the sessions with the user name

# alternative 2: plot a single day against the date-time column
DATA %>% 
  filter(DateLastUpdate == "2019-03-22") %>%  # filter on the day
  ggplot(aes(x = DateTimeLastUpdate, y = TotalBandwidth, color = SessionIdFac)) +
  geom_line() +
  # expand limits for the day; tz matches the tz used to build DateTimeLastUpdate
  expand_limits(x = c(as.POSIXct("2019-03-22 00:00:00", tz = "EST"),
                      as.POSIXct("2019-03-22 23:59:00", tz = "EST"))) +
  scale_color_discrete(labels = user) # label the sessions with the user name
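Another option, sketched here under the assumption that it is acceptable for all sessions of one user to share a color: map color to username and add group = SessionId. The group aesthetic makes geom_line() draw one line per session, so the end of one session is not joined to the start of the next, and the legend lists user names directly. The `demo` data and its session ids are made up for illustration.

```r
library(ggplot2)

# Hypothetical data: user "rs..." has two sessions, "cr..." has one.
demo <- data.frame(
  username  = c("rs...", "rs...", "rs...", "rs...", "cr...", "cr..."),
  SessionId = c(1L, 1L, 2L, 2L, 3L, 3L),
  DateTimeLastUpdate = as.POSIXct(
    c("2019-03-22 01:00:00", "2019-03-22 02:00:00",
      "2019-03-22 10:00:00", "2019-03-22 11:00:00",
      "2019-03-22 09:00:00", "2019-03-22 12:00:00"),
    tz = "EST"
  ),
  TotalBandwidth = c(100, 140, 0, 30, 5, 427)
)

p <- ggplot(demo, aes(x = DateTimeLastUpdate, y = TotalBandwidth,
                      color = username,     # legend shows user names
                      group = SessionId)) + # one line per session
  geom_line() +
  geom_point()
```

With this mapping the drop from 140 MB back to 0 MB between the two "rs..." sessions is not drawn, because the two sessions belong to different groups.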