I have a coding question that I think should be easy. I created a simplified dataset:
library(data.table)
DT <- data.table(Bank=rep(c("a","b","c"),4),
                 Type=rep(c("Ass","Liab"),6),
                 Amount=c(100,200,300,400,200,300,400,500,200,100,300,100))
# Bank Type Amount
# 1: a Ass 100
# 2: b Liab 200
# 3: c Ass 300
# 4: a Liab 400
# 5: b Ass 200
# 6: c Liab 300
# 7: a Ass 400
# 8: b Liab 500
# 9: c Ass 200
# 10: a Liab 100
# 11: b Ass 300
# 12: c Liab 100
I want to create a variable that holds the sum of Amount over the rows where Type == "Liab", for each bank. That part is no problem:
DT[Type=='Liab',SumLiab:=sum(Amount),by=Bank]
# Bank Type Amount SumLiab
# 1: a Ass 100 NA
# 2: b Liab 200 700
# 3: c Ass 300 NA
# 4: a Liab 400 500
# 5: b Ass 200 NA
# 6: c Liab 300 400
# 7: a Ass 400 NA
# 8: b Liab 500 700
# 9: c Ass 200 NA
# 10: a Liab 100 500
# 11: b Ass 300 NA
# 12: c Liab 100 400
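But I want this value on all rows, including those where Type == 'Ass'. I know that I currently get NA because of the DT[Type=='Liab',..] restriction. Is there a clever way to code this so that every row gets a value for SumLiab? (For example, row 1, whose SumLiab is currently NA, should get the value 500.)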
Thanks! Tim
Answer 0: (score: 1)
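When we use Type=='Liab' in 'i', the value is assigned only to the rows selected by 'i', so all other rows stay NA. Instead, we can leave 'i' empty, subset 'Amount' by Type=='Liab' inside 'j', and assign (:=) the result as a new variable. Because the sum is a single value per Bank group, it is recycled to every row of that group.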
DT[, SumLiab:= sum(Amount[Type=='Liab']), by =Bank]
DT
# Bank Type Amount SumLiab
#1: a Ass 100 500
#2: b Liab 200 700
#3: c Ass 300 400
#4: a Liab 400 500
#5: b Ass 200 700
#6: c Liab 300 400
#7: a Ass 400 500
#8: b Liab 500 700
#9: c Ass 200 400
#10: a Liab 100 500
#11: b Ass 300 700
#12: c Liab 100 400
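As an alternative to subsetting inside 'j', the same result can probably be reached with an update join: first compute one total per bank from the 'Liab' rows, then copy it onto every row of the matching bank. The following is only a sketch under that assumption; the intermediate table name `liab` is made up for illustration and is not part of the original question or answer.

```r
library(data.table)

# Same simplified dataset as in the question
DT <- data.table(Bank   = rep(c("a", "b", "c"), 4),
                 Type   = rep(c("Ass", "Liab"), 6),
                 Amount = c(100, 200, 300, 400, 200, 300, 400, 500, 200, 100, 300, 100))

# Step 1: one row per bank holding the total of its "Liab" amounts
# (`liab` is a hypothetical helper table)
liab <- DT[Type == "Liab", .(SumLiab = sum(Amount)), by = Bank]

# Step 2: update join on Bank; i.SumLiab refers to the column coming from `liab`
DT[liab, on = "Bank", SumLiab := i.SumLiab]

DT
```

One difference worth noting: for a bank that has no 'Liab' rows at all, the update join leaves SumLiab as NA, whereas sum(Amount[Type=='Liab']) returns 0, because the sum of an empty vector is 0. Which behaviour is preferable depends on how such banks should be treated.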
Answer 1: (score: 0)