Question

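I have a transaction dataset with the variables listed below. You can download it here: https://yadi.sk/d/BIXivmVJ34Akbn

The file differs slightly: instead of id, the column there is the customer ID.

id, mcc_code - transaction code, tr_datetime, tr_type - transaction type, amount, term_id - terminal ID, gender.

I want to create a new column, trans_count, holding the number of transactions per person (id) per day. How can I do this? Thank you very much.

Here I split the date and the time into separate columns.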

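library(readr)  # read_csv()
library(tidyr)  # separate()

trans_train <- read_csv("~/shared/minor3_2017/3-SecondYear-ML/hw_data/transactions_train.csv")
# Split tr_datetime ("day time") into two columns, then store day as an integer
trans_train <- separate(trans_train, col = tr_datetime, into = c("day", "time"), sep = " ")
trans_train$day <- as.integer(trans_train$day)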

dput(head(trans_train))

Output

structure(list(day = c(0L, 0L, 0L, 0L, 0L, 0L), time = c("03:16:05", 
"11:36:09", "11:37:11", "12:20:45", "12:36:57", "13:53:33"), 
mcc_code = c(6011L, 5499L, 5411L, 5912L, 5499L, 4814L), tr_type = c(2010L, 
1010L, 1010L, 1010L, 1010L, 1030L), amount = c(-950, -13.5, 
-271.43, -134, -544, -100), term_id = c(NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_
), id = c(1726L, 1726L, 1726L, 1726L, 1726L, 1726L)), .Names = c("day", 
"time", "mcc_code", "tr_type", "amount", "term_id", "id"), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

Answer 1

I don't know of a concise way to add the column the way you describe. However, if you want to create a new summary table, you can use:

library(dplyr)

trans_train %>%
  group_by(day, id) %>%
  summarize(transactions_per_day_per_customer = n())
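The summary above returns one row per (day, id) pair. If the count is wanted as a column on the original rows instead, a grouped mutate() may do it. The following is only a sketch on made-up toy data: trans_toy and its values are invented for illustration, and only the column names day, id and amount come from the question.

```r
library(dplyr)

# Hypothetical toy data; column names follow the question, values are made up
trans_toy <- data.frame(
  day    = c(0L, 0L, 0L, 1L, 1L, 0L),
  id     = c(1726L, 1726L, 1726L, 1726L, 1726L, 42L),
  amount = c(-950, -13.5, -271.43, -134, -544, -100)
)

# mutate() keeps every row; n() is the size of the current (id, day) group
trans_toy <- trans_toy %>%
  group_by(id, day) %>%
  mutate(trans_count = n()) %>%
  ungroup()

print(trans_toy)
```

On the real data the same pipeline would presumably be applied to trans_train. dplyr also provides add_count(id, day, name = "trans_count"), which should give the same column in a single call.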

R: creating a new column in a dataset