I have the following data:
df <-
Session  Volume  StartTime                EndTime
1        27,75   2016-01-22 17:00:33.707  2016-01-27 06:02:54.900
2        10,78   2016-01-22 14:31:22.127  2016-01-23 15:01:20.997
3        15,88   2016-01-27 12:46:18.660  2016-01-27 15:01:23.250
4        46,10   2016-01-25 16:01:34.613  2016-01-25 21:46:35.477
5        94,60   2016-01-27 05:38:06.597  2016-01-27 06:08:06.027
6        15,93   2016-01-20 16:15:59.350  2016-01-21 06:06:43.933
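I want to sum the data per day (based on StartTime) to get a dataset with the total volume of all sessions for each day, so that I can plot the volume over time. For example, on 2016-01-22 a total of 38.53 was charged. So: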
dfnew <-
Day TotalVolume
2016-01-22 38.53
2016-01-25 46.10
2016-01-27 110.48
etc.
What is the most efficient way to do this?
Answer 0 (score: 1)
Using data.table:
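library(data.table)
# Parse the timestamp strings into POSIXct date-times (modifies df by reference)
df[, StartTime := as.POSIXct(StartTime)]
# Sum Volume per calendar day of StartTime
df[, .(TotalVolume = sum(Volume)), by = .(Day = as.Date(StartTime))]
          Day TotalVolume
1: 2016-01-22       38.53
2: 2016-01-27      110.48
3: 2016-01-25       46.10
4: 2016-01-20       15.93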
And with dplyr:
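library(dplyr)
df %>%
  # Parse the timestamp strings into POSIXct date-times
  mutate(StartTime = as.POSIXct(StartTime)) %>%
  # Group by the calendar day of StartTime, then total the volume per day
  group_by(Day = as.Date(StartTime)) %>%
  summarise(TotalVolume = sum(Volume))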
Here is the data:
df <- as.data.table(read.table(text = "
Session; Volume; StartTime; EndTime
1; 27,75; 2016-01-22 17:00:33.707; 2016-01-27 06:02:54.900
2; 10,78; 2016-01-22 14:31:22.127; 2016-01-23 15:01:20.997
3; 15,88; 2016-01-27 12:46:18.660; 2016-01-27 15:01:23.250
4; 46,10; 2016-01-25 16:01:34.613; 2016-01-25 21:46:35.477
5; 94,60; 2016-01-27 05:38:06.597; 2016-01-27 06:08:06.027
6; 15,93; 2016-01-20 16:15:59.350; 2016-01-21 06:06:43.933", header = TRUE, sep = ";", dec = ","))
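For comparison, the same daily totals can be sketched in base R alone, which may help when checking the results above. This is only an illustration, not part of the original answer: the `Day` column and the `dfnew` name are assumptions borrowed from the question's desired output, and `strip.white = TRUE` is added here so that the text fields carry no leading spaces. The calendar day is taken directly from the timestamp text, so no time-zone conversion is involved.

```r
# Base R sketch; the data is copied from the question.
df <- read.table(text = "
Session; Volume; StartTime; EndTime
1; 27,75; 2016-01-22 17:00:33.707; 2016-01-27 06:02:54.900
2; 10,78; 2016-01-22 14:31:22.127; 2016-01-23 15:01:20.997
3; 15,88; 2016-01-27 12:46:18.660; 2016-01-27 15:01:23.250
4; 46,10; 2016-01-25 16:01:34.613; 2016-01-25 21:46:35.477
5; 94,60; 2016-01-27 05:38:06.597; 2016-01-27 06:08:06.027
6; 15,93; 2016-01-20 16:15:59.350; 2016-01-21 06:06:43.933",
  header = TRUE, sep = ";", dec = ",",
  strip.white = TRUE, stringsAsFactors = FALSE)

# The first 10 characters of each timestamp are the date (YYYY-MM-DD).
df$Day <- as.Date(substr(df$StartTime, 1, 10))

# Sum Volume within each day; aggregate() returns the rows sorted by Day.
dfnew <- aggregate(Volume ~ Day, data = df, FUN = sum)
names(dfnew)[2] <- "TotalVolume"  # hypothetical name, taken from the question
print(dfnew)
```

Unlike the data.table result, which lists days in order of first appearance, this result is sorted by date.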
Answer 1 (score: 0)
I prepared a small reproducible example. Is this what you want?
library(lubridate)
library(dplyr)

df <- tibble(
  Volume = c(27.75, 10.78, 15.88, 46.1, 94.60, 15.93),
  StartTime = c("2016-01-22 17:00:33.707", "2016-01-22 14:31:22.127",
                "2016-01-27 12:46:18.660", "2016-01-25 16:01:34.613",
                "2016-01-27 05:38:06.597", "2016-01-20 16:15:59.350")
)

df <- df %>%
  mutate(StartTime = ymd_hms(StartTime)) %>%                   # parse the strings as date-times
  mutate(StartTime = floor_date(StartTime, unit = "day")) %>%  # truncate each one to midnight
  group_by(StartTime) %>%
  dplyr::summarize(Volume = sum(Volume))                       # total volume per day
> df
# A tibble: 4 x 2
  StartTime           Volume
  <dttm>               <dbl>
1 2016-01-20 00:00:00   15.9
2 2016-01-22 00:00:00   38.5
3 2016-01-25 00:00:00   46.1
4 2016-01-27 00:00:00  110
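Since the question's goal is to plot the volume over time, here is a rough base R sketch of that last step. It rests on assumptions the thread does not state: that days without any session should count as zero, and that a simple vertical-line plot is acceptable. The names `daily`, `all_days`, and `filled` are made up for illustration, and the totals are the unrounded values from the data.table answer.

```r
# Daily totals as computed in the answers above (unrounded values).
daily <- data.frame(
  Day = as.Date(c("2016-01-20", "2016-01-22", "2016-01-25", "2016-01-27")),
  TotalVolume = c(15.93, 38.53, 46.10, 110.48)
)

# Build one row per calendar day between the first and last observed day.
all_days <- data.frame(Day = seq(min(daily$Day), max(daily$Day), by = "day"))

# Left join on Day; days with no session get NA, replaced by 0 here (assumption).
filled <- merge(all_days, daily, by = "Day", all.x = TRUE)
filled$TotalVolume[is.na(filled$TotalVolume)] <- 0

# One vertical line per day; plot() draws a date axis for Date values.
plot(filled$Day, filled$TotalVolume, type = "h",
     xlab = "Day", ylab = "Total volume")
```

If gaps should stay empty rather than show as zero, the merge and replacement steps can be skipped and `daily` plotted directly.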