R: identify and sum balances that have no history

Time: 2018-03-19 13:32:59

Tags: r data.table reshape aggregation

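I have a set of accounts with balances over 4 months. For each month I want the balances that have only just appeared in that month. This is what I have so far.

A new account is originated in each month.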

Accounts <- c('A','B','C','A','B','C','A','B','C')
Dates <- as.Date(c('2016-01-31', '2016-01-31','2016-01-31','2016-02-28','2016-02-28','2016-02-28','2016-03-31','2016-03-31','2016-03-31'))
Balances <- c(100,NA,NA,90,50,NA,80,40,120)
Origination <- data.frame(Dates,Accounts,Balances)

library(reshape2)
Origination <- dcast(Origination,Dates ~ Accounts, value.var = "Balances")
Origination$Originated <- apply(Origination[2:4],1,function(x) ifelse(sum(is.na(x))==nrow(Origination),NA,tail(na.omit(x),1)))
Origination <- melt(Origination, id = c("Dates"))
Origination <-dcast(Origination, variable ~ Dates, value.var = "value")

    variable 2016-01-31 2016-02-28 2016-03-31
1          A        100         90         80
2          B         NA         50         40
3          C         NA         NA        120
4 Originated        100         50        120

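This reproduces the original table with an extra row called Originated. In the first month we only have 100. In the second month A has amortised to 90, but there is also a new account at 50. In the last month A and B have both amortised, and the new account C comes in at 120. The Originated row captures this exactly as I want.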

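But if I add a fourth account, D, so that two accounts (B and C) originate in month 2, the code only picks up the last amount (10) rather than the sum of the two originating balances, i.e. 50 (B) plus 10 (C).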

Accounts <- c('A','B','C','D','A','B','C','D','A','B','C','D')
Dates <- as.Date(c('2016-01-31', '2016-01-31','2016-01-31','2016-01-31','2016-02-28','2016-02-28','2016-02-28','2016-02-28','2016-03-31','2016-03-31','2016-03-31','2016-03-31'))
Balances <- c(100,NA,NA,NA,90,50,10,NA,80,40,5,120)
Origination <- data.frame(Dates,Accounts,Balances)

library(reshape2)
Origination <- dcast(Origination,Dates ~ Accounts, value.var = "Balances")
Origination$Originated <- apply(Origination[2:4],1,function(x) ifelse(sum(is.na(x))==nrow(Origination),NA,tail(na.omit(x),1)))
Origination <- melt(Origination, id = c("Dates"))
Origination <-dcast(Origination, variable ~ Dates, value.var = "value")

    variable 2016-01-31 2016-02-28 2016-03-31
1          A        100         90         80
2          B         NA         50         40
3          C         NA         10          5
4          D         NA         NA        120
5 Originated        100         10          5
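
The 10 in the second column follows from how `tail(na.omit(x), 1)` behaves: it returns only the last non-missing value in a row, so it can never add two balances together. The third column shows 5 for the same reason, with column D also falling outside the `[2:4]` selection. A minimal sketch of the difference, using vectors copied by hand from the January and February columns of the table above (the names `jan`, `feb`, `last_value`, `new_accounts` and `originated` are only illustrative):

```r
# January and February balances for accounts A, B, C (copied from the table)
jan <- c(A = 100, B = NA, C = NA)
feb <- c(A = 90, B = 50, C = 10)

# What the Originated column currently computes: the last non-NA value only
last_value <- tail(na.omit(feb), 1)

# What is wanted: the sum over accounts with no balance in the month before
new_accounts <- is.na(jan) & !is.na(feb)
originated <- sum(feb[new_accounts])

print(last_value)   # the value for C, i.e. 10
print(originated)   # 60
```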

So my question is: how do I sum the newly added accounts among A to D for each date? Maybe I am overthinking it. The result I want is:

    variable 2016-01-31 2016-02-28 2016-03-31
1          A        100         90         80
2          B         NA         50         40
3          C         NA         10          5
4          D         NA         NA        120
5 Originated        100         60        120

Any help is much appreciated. Axel

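1 answer:

Answer 0: (score: 0)

I finally found a way to get the output I wanted. Here is the answer for anyone interested. Note that `Origination` here has to be the wide table produced by the first `dcast` (one row per date, one column per account), before the `Originated` column is added and before the `melt` step. `sel` marks every cell whose account already had a balance in the previous month; those cells are set to 0, and the row sums of what remains are the newly originated balances.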

sel <- rbind(FALSE, !is.na(head(Origination[-1], -1)))
#sel
#         A     B     C     D
#[1,] FALSE FALSE FALSE FALSE
#[2,]  TRUE FALSE FALSE FALSE
#[3,]  TRUE  TRUE  TRUE FALSE

rowSums(replace(Origination[-1], sel, 0), na.rm=TRUE)
#[1] 100  60 120
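
For completeness, here is a sketch of how the snippet above might be wired into the question's workflow to produce the desired table. It assumes the `reshape2` package is installed; the intermediate names `long`, `wide` and `result` are illustrative and do not appear in the original code.

```r
library(reshape2)

# Long-format example data from the question (four accounts, three months)
Accounts <- rep(c("A", "B", "C", "D"), times = 3)
Dates <- as.Date(rep(c("2016-01-31", "2016-02-28", "2016-03-31"), each = 4))
Balances <- c(100, NA, NA, NA, 90, 50, 10, NA, 80, 40, 5, 120)
long <- data.frame(Dates, Accounts, Balances)

# One row per date, one column per account
wide <- dcast(long, Dates ~ Accounts, value.var = "Balances")

# TRUE where the account already had a balance in the previous month
sel <- rbind(FALSE, !is.na(head(wide[-1], -1)))

# Zero out the existing accounts, then sum what is left in each month
wide$Originated <- rowSums(replace(wide[-1], sel, 0), na.rm = TRUE)

# Reshape so that accounts are rows and dates are columns
result <- dcast(melt(wide, id.vars = "Dates"), variable ~ Dates,
                value.var = "value")
print(result)
```

Two caveats with this approach: because of `na.rm = TRUE`, a month with no new accounts yields 0 rather than NA, and an account whose balance is missing for one month and then reappears would be counted as newly originated again, since only the immediately preceding month is checked.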