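I have a transactions dataset with the following variables: id, mcc_code (transaction code), tr_datetime, tr_type (transaction type), amount, term_id (terminal ID), and gender.
You can download it here: https://yadi.sk/d/BIXivmVJ34Akbn
The downloadable file is slightly different: it has customer_id instead of id.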
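I want to create a new column, trans_count, holding the number of transactions per person (id) per day. How can I do that? Thanks a lot.
Here is how I separated the date and time: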
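library(readr)  # read_csv()
library(tidyr)  # separate()
trans_train <- read_csv("~/shared/minor3_2017/3-SecondYear-ML/hw_data/transactions_train.csv")
# Split tr_datetime at the space into a day part and a time part
trans_train <- separate(trans_train, col = tr_datetime, into = c("day", "time"), sep = " ")
trans_train$day <- as.integer(trans_train$day)
dput(head(trans_train))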
Output:
structure(list(day = c(0L, 0L, 0L, 0L, 0L, 0L), time = c("03:16:05",
"11:36:09", "11:37:11", "12:20:45", "12:36:57", "13:53:33"),
mcc_code = c(6011L, 5499L, 5411L, 5912L, 5499L, 4814L), tr_type = c(2010L,
1010L, 1010L, 1010L, 1010L, 1030L), amount = c(-950, -13.5,
-271.43, -134, -544, -100), term_id = c(NA_character_, NA_character_,
NA_character_, NA_character_, NA_character_, NA_character_
), id = c(1726L, 1726L, 1726L, 1726L, 1726L, 1726L)), .Names = c("day",
"time", "mcc_code", "tr_type", "amount", "term_id", "id"), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
Answer 0 (score: 0)
I don't know of a concise way to add a column in the way you describe. However, if you want to create a new summary table, you can use:
library(dplyr)
trans_train %>%
  group_by(day, id) %>%
  summarize(transactions_per_day_per_customer = n())
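If the count should live on every row of the original table rather than in a separate summary, one possible approach is to swap summarize() for mutate(), which keeps all rows and repeats the group size on each of them. The following is only a minimal sketch on made-up toy data shaped like the dput() output above; the names toy and toy_counted are hypothetical and not part of the original dataset.

```r
library(dplyr)

# Toy data (made-up values) with the same id/day/amount columns as the question
toy <- data.frame(
  id     = c(1726L, 1726L, 1726L, 1726L, 2001L),
  day    = c(0L, 0L, 0L, 1L, 0L),
  amount = c(-950, -13.5, -271.43, -134, -100)
)

# Inside mutate(), n() gives the size of the current (id, day) group,
# so every transaction row receives its customer's count for that day
toy_counted <- toy %>%
  group_by(id, day) %>%
  mutate(trans_count = n()) %>%
  ungroup()

print(toy_counted)
```

On the real data the same pipeline would start from trans_train instead of toy. dplyr also provides add_count(), so add_count(toy, id, day, name = "trans_count") should give an equivalent column in one call, assuming a dplyr version that supports the name argument.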