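I have a large amount of transaction data that tracks purchases, returns, and the point at which the point-of-sale operator clears a transaction after receiving a payment or issuing a refund. I'd like to number sessions based on when the cashier "clears" the screen, so that all the transactions leading up to a given clear share the same number.
I've stripped out all the non-essential data, but here is what the dput() looks like: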
my.data.1<-structure(list(TOTSND_Clear = c("0", "0", "0", "0", "0", "0",
"4.00", "0", "0", "10.00", "0", "0", "12.00", "0", "-5.00"),
TOTSND_UNBAL = c("0", "1.00", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0")), .Names = c("TOTSND_Clear",
"TOTSND_UNBAL"), row.names = c(NA, 15L), class = "data.frame")
Which looks like this:
TOTSND_Clear TOTSND_UNBAL
           0            0
           0         1.00
           0            0
           0            0
           0            0
           0            0
        4.00            0
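All those zeros represent other kinds of transactions taking place, whether sales or refunds. When TOTSND_Clear or TOTSND_UNBAL holds a value, the transaction instance is ending. The numbers are dollar amounts, not counts of transaction types (it just happens to look that way in this example).
I'd like to produce these results: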
my.data.results<-structure(list(TOTSND_Clear = c("0", "0", "0", "0", "0", "0",
"4.00", "0", "0", "10.00", "0", "0", "12.00", "0", "-5.00"),
TOTSND_UNBAL = c("0", "1.00", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0"), session = c(1, 1, 2,
2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5)), .Names = c("TOTSND_Clear",
"TOTSND_UNBAL", "session"), row.names = c(NA, 15L), class = "data.frame")
Which looks like this:
TOTSND_Clear TOTSND_UNBAL session
           0            0       1
           0         1.00       1
           0            0       2
           0            0       2
           0            0       2
           0            0       2
        4.00            0       2
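I would post code, but I don't know where to begin. I've found ways to number the instances, but they give the same number to the rows that follow the previous clear rather than to the rows that lead up to the clear.
Answer 0 (score: 2)
Maybe something like this...?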
# rows where a session ends (either column holds a non-zero value)
ind <- which(with(my.data.1, TOTSND_Clear != 0 | TOTSND_UNBAL != 0))
rep(seq_along(ind), times = c(ind[1], diff(ind)))
# [1] 1 1 2 2 2 2 2 3 3 3 4 4 4 5 5
You can then add that as a column.
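As a minimal sketch of that last step, the vector can be assigned straight to a new column. The column name session is taken from the asker's desired output; the as.numeric() conversion is my own addition (not part of the answer) so that a string such as "0.00" would not be mistaken for a clear.

```r
# Sample data rebuilt from the question's dput()
my.data.1 <- data.frame(
  TOTSND_Clear = c("0", "0", "0", "0", "0", "0", "4.00", "0", "0", "10.00",
                   "0", "0", "12.00", "0", "-5.00"),
  TOTSND_UNBAL = c("0", "1.00", rep("0", 13)),
  stringsAsFactors = FALSE
)

# TRUE on rows that end a session (assumption: compare as numbers, not strings)
ends <- as.numeric(my.data.1$TOTSND_Clear) != 0 |
  as.numeric(my.data.1$TOTSND_UNBAL) != 0
ind <- which(ends)

# Each session number is repeated once per row up to and including its clear
my.data.1$session <- rep(seq_along(ind), times = c(ind[1], diff(ind)))
my.data.1$session
# [1] 1 1 2 2 2 2 2 3 3 3 4 4 4 5 5
```

Note that this relies on the last row being a clear, as it is in the sample. If rows trail after the final clear, the repeated vector comes out shorter than the data frame and the assignment fails.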
Answer 1 (score: 2)
Here's one way:
c(1, cumsum(diff(as.logical(rowSums(
my.data.1[c("TOTSND_Clear", "TOTSND_UNBAL")] != 0))) < 0) + 1)
# [1] 1 1 2 2 2 2 2 3 3 3 4 4 4 5 5
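For comparison, here is a sketch of a slightly different technique (not the one used above): shift the end-of-session flag down by one row and take a running sum, so a row starts a new session whenever the row before it was a clear. The helper name number_sessions and the as.numeric() conversion are my own assumptions, not from either answer.

```r
# Hypothetical helper: number sessions from the two amount columns
number_sessions <- function(clear, unbal) {
  # TRUE on rows that end a session
  ends <- as.numeric(clear) != 0 | as.numeric(unbal) != 0
  # a row opens a new session when it is first or follows a clear
  cumsum(c(TRUE, head(ends, -1)))
}

clear <- c("0", "0", "0", "0", "0", "0", "4.00", "0", "0", "10.00",
           "0", "0", "12.00", "0", "-5.00")
unbal <- c("0", "1.00", rep("0", 13))

number_sessions(clear, unbal)
# [1] 1 1 2 2 2 2 2 3 3 3 4 4 4 5 5

# Rows after the final clear still receive a session number
number_sessions(c("0", "4.00", "0", "0"), c("0", "0", "0", "0"))
# [1] 1 1 2 2
```

One behavioral difference to be aware of: with two clears on consecutive rows, this version gives each its own session, whereas the diff()-based expression keeps them in the same one.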