I have a dataset that looks like this (call it the stamp data):
Date_Time            Cost
-------------------  ----
01/02/2015 01:52 PM  6
01/02/2015 02:22 PM  2
01/03/2015 02:42 PM  50
01/04/2015 03:01 PM  25
and a different dataset (the customer data) that looks like this:
Purchase_time        Amount
-------------------  ------
01/02/2015 01:57 PM  5
01/02/2015 02:46 PM  12
01/02/2015 03:13 PM  2
01/02/2015 03:30 PM  8
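I want to sum the "Amount" column of the customer data over different time windows measured from each Date_Time in the stamp data, so that the final result looks like this: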
Date_Time            Cost  Amount_15min  Amount_30min
-------------------  ----  ------------  ------------
01/02/2015 01:52 PM  6     5             5
01/02/2015 02:22 PM  2     0             12
01/03/2015 02:42 PM  50    12            12
01/04/2015 03:01 PM  25    8             8
Ideally, I would like to create one column per 15-minute interval, up to 360 minutes (or longer).
How can I do this in R?
Thanks!
Answer 0 (score: 0)
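I think you'll find most of the code straightforward. The date strings have to be converted to POSIX date-time objects before we can do arithmetic on them. Converted to numeric, a POSIX date-time is the number of seconds elapsed since January 1, 1970 (UTC), so for the window arithmetic we convert to numeric and then add or subtract seconds.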
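As a small standalone illustration of that conversion, here is a sketch that parses one timestamp in the question's format and shifts it back 15 minutes. The names `t0`, `secs`, and `lower` are made up for this example, and `tz = "UTC"` is an assumption; use whatever time zone your data is recorded in.

```r
# Parse one timestamp in the question's format (tz = "UTC" is an assumption)
t0 <- as.POSIXct("01/02/2015 01:52 PM",
                 format = "%m/%d/%Y %I:%M %p", tz = "UTC")

# Seconds elapsed since 1970-01-01 00:00:00 UTC
secs <- as.numeric(t0)

# Shift back 15 minutes by subtracting seconds from the numeric value
lower <- secs - 15 * 60

# Convert the number back to a date-time to inspect it
lower_time <- as.POSIXct(lower, origin = "1970-01-01", tz = "UTC")
print(lower_time)
```

Parsing `%p` (AM/PM) depends on the session locale, so if the result comes back as `NA`, check the locale first.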
### Build test data frame
### times is a character vector and cost is a numeric vector
times <- c(
  "01/02/2015 01:52 PM",
  "01/02/2015 01:57 PM",
  "01/02/2015 01:58 PM",
  "01/02/2015 02:52 PM",
  "01/02/2015 02:55 PM")
cost <- c(8, 2, 50, 26, 7)
df <- data.frame(times = times, cost = cost, stringsAsFactors = FALSE)

### Convert times to POSIXct. strptime() alone returns POSIXlt, a list-like
### class that is awkward to store in a data frame column.
df$times <- as.POSIXct(strptime(df$times, format = "%m/%d/%Y %I:%M %p"))

### Window length in minutes
pollinglength <- 15

### Create empty vector to hold sums
amount <- rep(NA_real_, nrow(df))

for (i in seq_len(nrow(df))) {
  ### Rows at or before the current time (POSIX date-times support comparison operators)
  upperWindow <- df$times <= df$times[i]
  ### Rows later than (current time - window). Converting to numeric gives
  ### seconds since the epoch, so the window length is subtracted in seconds.
  lowerWindow <- as.numeric(df$times) > (as.numeric(df$times[i]) - pollinglength * 60)
  amount[i] <- sum(df$cost[upperWindow & lowerWindow])
}

### Add back to data frame
df$amount <- amount
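The loop above sums within a single data frame and looks backward from each row. To get closer to what the question asks for, the same idea could be applied across the two tables with one column per window length. The following is only a sketch under some assumptions: the windows are taken to run forward from each Date_Time (which matches the first two rows of the expected output), both window ends are inclusive, and `tz = "UTC"` is arbitrary. The names `stamps`, `customers`, and `window_sum` are hypothetical, and only the first two stamp rows from the question are used.

```r
fmt <- "%m/%d/%Y %I:%M %p"

# First two stamp rows from the question
stamps <- data.frame(
  Date_Time = as.POSIXct(c("01/02/2015 01:52 PM", "01/02/2015 02:22 PM"),
                         format = fmt, tz = "UTC"),
  Cost = c(6, 2)
)

# Customer data from the question
customers <- data.frame(
  Purchase_time = as.POSIXct(c("01/02/2015 01:57 PM", "01/02/2015 02:46 PM",
                               "01/02/2015 03:13 PM", "01/02/2015 03:30 PM"),
                             format = fmt, tz = "UTC"),
  Amount = c(5, 12, 2, 8)
)

# For each start time, sum the amounts of purchases made between the start
# and `minutes` later (both ends inclusive; this is an assumption)
window_sum <- function(starts, minutes, purchase_times, amounts) {
  vapply(seq_along(starts), function(i) {
    elapsed <- as.numeric(difftime(purchase_times, starts[i], units = "mins"))
    sum(amounts[elapsed >= 0 & elapsed <= minutes])
  }, numeric(1))
}

# One column per window: 15, 30, ..., 360 minutes
windows <- seq(15, 360, by = 15)
for (w in windows) {
  stamps[[paste0("Amount_", w, "min")]] <-
    window_sum(stamps$Date_Time, w, customers$Purchase_time, customers$Amount)
}

print(stamps[, c("Date_Time", "Cost", "Amount_15min", "Amount_30min")])
```

This compares every stamp against every purchase for each window, which is fine for small tables. For large data, a non-equi join or a rolling approach would likely scale better.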