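I have a table with two columns, date and user name, and I want to plot in R the number of records per user, by day or by month.
Is there a straightforward way to do this?
My table: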
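I want to show the activity level of every user in the same plot so that the users can be compared. It could be a line chart, or any other kind of chart you think works better.
The question is very basic, but so far I have not found a direct solution. Please also keep in mind that I am a beginner.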
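Answer 0 (score: 0)
Reproducible data created with dput, together with the expected output, is always helpful. That said, based on your data, here is my best shot. The visualization could still be improved.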
df <- structure(list(User = structure(c(1L, 2L, 2L, 3L, 4L, 3L, 3L,
2L, 2L, 1L, 1L, 1L, 1L, 4L), .Label = c("a", "b", "c", "d"), class = "factor"),
Time = c("2016-05-02 03:45:11", "2016-05-05 04:05:24", "2016-06-05 07:23:16",
"2016-05-08 08:37:37", "2016-05-09 11:28:15", "2016-08-11 23:41:18",
"2016-05-11 03:51:14", "2016-05-11 06:16:21", "2016-07-15 20:23:35",
"2016-05-16 06:42:53", "2016-05-17 08:52:24", "2016-05-18 09:35:47",
"2016-05-19 03:24:39", "2016-07-12 06:39:26")), .Names = c("User",
"Time"), row.names = c("1", "2", "3", "4", "5", "6", "7", "8",
"9", "10", "11", "12", "13", "14"), class = "data.frame")
library(tidyverse)
library(lubridate)
# Count records per user and hour of day. For per-day counts, use day() or as_date() from lubridate instead of hour().
df_clean <- df %>%
  mutate(Hour = hour(ymd_hms(Time))) %>% # parse the character timestamps, then extract the hour
  count(User, Hour) # one row per user and hour, with the count in column n
# Stacked bars, one segment per user; position_stack(vjust = 0.5) centres each label in its segment
ggplot(data = df_clean, aes(x = Hour, y = n, fill = User)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  geom_text(aes(label = n), position = position_stack(vjust = 0.5, reverse = TRUE),
            color = "white", size = 3.5)
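Since the question asks for counts by day or month and mentions a line chart, here is a minimal sketch of that variant. It is not part of the answer above: the `logs` data frame and the `Month` column are made-up names for illustration, and the sample rows are hypothetical. It assumes the timestamps are strings in `YYYY-MM-DD HH:MM:SS` form, as in `df`.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Hypothetical sample data with the same two columns as `df`
logs <- data.frame(
  User = c("a", "b", "b", "c", "a", "a", "c"),
  Time = c("2016-05-02 03:45:11", "2016-05-05 04:05:24", "2016-06-05 07:23:16",
           "2016-05-08 08:37:37", "2016-06-16 06:42:53", "2016-06-17 08:52:24",
           "2016-07-12 06:39:26")
)

# Truncate each timestamp to the first day of its month, then count rows
# per user and month. Use unit = "day" for daily counts instead.
monthly <- logs %>%
  mutate(Month = as_date(floor_date(ymd_hms(Time), unit = "month"))) %>%
  count(User, Month)

# One line per user, so activity levels can be compared in a single plot
p <- ggplot(monthly, aes(x = Month, y = n, colour = User, group = User)) +
  geom_line() +
  geom_point() +
  labs(x = "Month", y = "Number of records")
print(p)
```

Note that a user with no records in a given month has no row in `monthly`, so the line simply connects the neighbouring points. If zero counts should be drawn explicitly, something like `tidyr::complete()` could fill in the missing combinations first.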