I have a dataset of card transactions.
date_time <- c("2016-07-10 21:04:00", "2016-07-10 21:04:00" , "2016-07-10 21:05:00" , "2016-07-10 21:06:00", "2016-07-10 21:07:00" , "2016-07-10 21:08:00" ,"2016-10-22 12:48:00" ,"2016-10-22 12:49:00" ,"2016-10-22 12:50:00" ,"2016-10-22 12:50:00", "2016-10-22 12:51:00", "2016-10-22 12:51:00" ,"2016-10-22 12:52:00", "2016-10-22 12:52:00" ,"2016-10-22 12:53:00","2016-10-22 12:54:00", "2016-10-22 12:54:00", "2016-10-22 12:55:00" , "2016-10-22 12:59:00", "2016-11-10 20:48:00", "2016-11-10 20:48:00", "2016-11-09 19:19:00" ,"2016-11-09 19:20:00")
card_no <- c("9", "9", "9", "9", "9", "9", "8", "8", "8", "8", "8", "8", "7", "7", "7", "7", "7", "7", "7", "4", "4", "3", "3")
txn_code <- c("4551", "4571", "4571", "4571", "4551", "4571", "4551", "4571", "4571", "4551", "4551", "4571", "4551", "4571", "4571", "4551", "4571", "4551", "4571","4551","4571" , "4571", "4551")
card_txn_df <- data.frame(date_time = date_time, card_no = card_no, txn_code = txn_code)
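What I want to do is append a column to the data frame that gives, for each transaction, the number of transactions within ±3 minutes of it on the same card.
I have a version that works, except that it does not group by card number.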
ret_adj <- function(this_date, all_dates) {
  # Count the entries of all_dates within 180 seconds (3 minutes) of this_date
  length(all_dates[abs(difftime(this_date, all_dates, units = "secs")) <= 180])
}
adj_txn <- unlist(lapply(card_txn_df$date_time, function(x) {ret_adj(x, card_txn_df$date_time)}))
I can't figure out how to bring a group_by on card_no into the lapply, or how to do the same thing with dplyr.
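To make the intended result concrete, here is a minimal base R sketch of the per-card count, not a definitive solution. The names `demo_df` and `count_adjacent` are hypothetical, the data is a shortened subset of the sample above, and the timestamps are assumed to parse as UTC.

```r
# Shortened subset of the sample data (assumption: enough to show the grouping).
demo_df <- data.frame(
  date_time = c("2016-07-10 21:04:00", "2016-07-10 21:04:00", "2016-07-10 21:08:00",
                "2016-10-22 12:48:00", "2016-10-22 12:49:00"),
  card_no = c("9", "9", "9", "8", "8"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each timestamp, count how many timestamps in the
# same vector fall within window_secs of it (the transaction itself included).
count_adjacent <- function(times, window_secs = 180) {
  secs <- as.numeric(as.POSIXct(times, tz = "UTC"))
  vapply(secs, function(s) sum(abs(secs - s) <= window_secs), integer(1))
}

# ave() runs the function once per card_no group on the row indices and
# returns the results in the original row order.
row_idx <- seq_len(nrow(demo_df))
demo_df$adj_txn <- ave(row_idx, demo_df$card_no,
                       FUN = function(i) count_adjacent(demo_df$date_time[i]))
```

Because the helper takes a whole vector of timestamps and returns one count per element, it should also fit inside a dplyr `group_by(card_no)` followed by `mutate(adj_txn = count_adjacent(date_time))`.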