I want to convert prices quoted in different currencies into one specific currency. Suppose I have this:
library(data.table)
set.seed(100)
DT <- data.table(day=1:10, price=runif(10), currency=c("aud","eur"),
aud=runif(10) + 1, eur=runif(10) + 1.5)
DT
day price currency aud eur
1: 1 0.30776611 aud 1.624996 2.035811
2: 2 0.25767250 eur 1.882166 2.210804
3: 3 0.55232243 aud 1.280354 2.038349
4: 4 0.05638315 eur 1.398488 2.248972
5: 5 0.46854928 aud 1.762551 1.920101
6: 6 0.48377074 eur 1.669022 1.671420
7: 7 0.81240262 aud 1.204612 2.270302
8: 8 0.37032054 eur 1.357525 2.381954
9: 9 0.54655860 aud 1.359475 2.049097
10: 10 0.17026205 eur 1.690291 1.777724
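Each day's price is expressed in the currency shown in the currency column. So 0.30776611 on day 1 is in aud (Australian dollars), and 0.25767250 on day 2 is in eur (euros). The aud and eur columns give the exchange rate of each currency to the US dollar. How can I create, in the data.table way, a new column with the price expressed in US dollars?
I need to multiply price by the column whose name is given in currency, to get this: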
DT
day price currency aud eur price.in.usd
1: 1 0.30776611 aud 1.624996 2.035811 0.5001187
2: 2 0.25767250 eur 1.882166 2.210804 0.5696634
3: 3 0.55232243 aud 1.280354 2.038349 0.7071682
4: 4 0.05638315 eur 1.398488 2.248972 0.1268041
5: 5 0.46854928 aud 1.762551 1.920101 0.825842
6: 6 0.48377074 eur 1.669022 1.671420 0.8085841
7: 7 0.81240262 aud 1.204612 2.270302 0.9786299
8: 8 0.37032054 eur 1.357525 2.381954 0.8820865
9: 9 0.54655860 aud 1.359475 2.049097 0.7430328
10: 10 0.17026205 eur 1.690291 1.777724 0.3026789
So for day 1 I multiply price * aud = 0.30776611 * 1.624996, because the currency column says the price is in aud, and for day 2 I multiply price * eur = 0.25767250 * 2.210804 for the same reason.
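The real data include about 40 currencies, so nesting many ifelse() calls into an arrow anti-pattern is not very convenient.
Currently, on a subsample of my data, I have this: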
DT.all[, price := ifelse(curcdd=="AUD", adj.price * AUD,
ifelse(curcdd=="BEF", adj.price * BEF,
ifelse(curcdd=="BGN", adj.price * BGN,
ifelse(curcdd=="CHF", adj.price * CHF,
ifelse(curcdd=="CZK", adj.price * CZK,
ifelse(curcdd=="DEM", adj.price * DEM,
ifelse(curcdd=="EUR", adj.price * EUR,
ifelse(curcdd=="FRF", adj.price * FRF,
ifelse(curcdd=="GBP", adj.price * GBP,
ifelse(curcdd=="ILS", adj.price * ILS,
ifelse(curcdd=="JPY", adj.price * JPY,
ifelse(curcdd=="NLG", adj.price * NLG,
ifelse(curcdd=="NOK", adj.price * NOK,
ifelse(curcdd=="PLN", adj.price * PLN,
ifelse(curcdd=="SEK", adj.price * SEK,
ifelse(curcdd=="SGD", adj.price * SGD,
ifelse(curcdd=="USD", adj.price, NA)))))))))))))))))]
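This works, but it covers only about 20 currencies, and doing the same for all of them (about 40) is certainly not elegant...
Thanks a lot!
Answer 0 (score: 4)
[Edit] The idea of using get to extract the values referenced by a column name, which I saw in an answer by Matthew Dowle, seems to work: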
setkey(DT, currency)
DT[ , cvt := .SD[, get(currency)]*price, by=currency]
DT
day price currency aud eur cvt
1: 1 0.30776611 aud 1.624996 2.035811 0.5001188
2: 3 0.55232243 aud 1.280354 2.038349 0.7071681
3: 5 0.46854928 aud 1.762551 1.920101 0.8258420
4: 7 0.81240262 aud 1.204612 2.270302 0.9786301
5: 9 0.54655860 aud 1.359475 2.049097 0.7430328
6: 2 0.25767250 eur 1.882166 2.210804 0.5696634
7: 4 0.05638315 eur 1.398488 2.248972 0.1268041
8: 6 0.48377074 eur 1.669022 1.671420 0.8085842
9: 8 0.37032054 eur 1.357525 2.381954 0.8820863
10: 10 0.17026205 eur 1.690291 1.777724 0.3026789
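As a quick check that the get-by-group idea carries over to more than two currencies, here is a minimal sketch on a made-up table. The table DT3, the third currency gbp, and all the numbers are assumptions for illustration only. setkey is skipped here, so the original row order is kept.

```r
library(data.table)

# Toy table with a hypothetical third currency (gbp); values are made up
DT3 <- data.table(
  day      = 1:6,
  price    = c(1, 2, 3, 4, 5, 6),
  currency = c("aud", "eur", "gbp", "gbp", "aud", "eur"),
  aud      = rep(2, 6),
  eur      = rep(3, 6),
  gbp      = rep(4, 6)
)

# Within each group `currency` is a single string naming the rate column,
# so get(currency) returns that column for the rows of the group
DT3[, price.in.usd := .SD[, get(currency)] * price, by = currency]
DT3
```

Adding a further currency only needs another rate column whose name matches the values in currency; the conversion line itself stays the same.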
Here is another way, although it does not generalize well to more currencies:
DT[ , cvt := ifelse (currency == 'aud', price*aud, price*eur) ]
> DT
day price currency aud eur cvt
1: 1 0.30776611 aud 1.624996 2.035811 0.5001188
2: 2 0.25767250 eur 1.882166 2.210804 0.5696634
3: 3 0.55232243 aud 1.280354 2.038349 0.7071681
4: 4 0.05638315 eur 1.398488 2.248972 0.1268041
5: 5 0.46854928 aud 1.762551 1.920101 0.8258420
6: 6 0.48377074 eur 1.669022 1.671420 0.8085842
7: 7 0.81240262 aud 1.204612 2.270302 0.9786301
8: 8 0.37032054 eur 1.357525 2.381954 0.8820863
9: 9 0.54655860 aud 1.359475 2.049097 0.7430328
10: 10 0.17026205 eur 1.690291 1.777724 0.3026789
You get a warning (and a different result) if you try to use if(.){.}else{.} instead:
DT[ , cvt := if (currency == 'aud'){price*aud}else{price*eur}]
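This is exactly analogous to how data.frames behave. However, using ifelse inside data.table is slow.
Answer 1 (score: 1)
In this solution you need to specify the number of distinct currencies (2 in this example) and the number of observations (10 in this example). It also assumes that the exchange-rate columns ('aud', 'eur', etc.) are the last columns.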
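The difference between the two forms can be seen on plain vectors. This is a minimal sketch with made-up values standing in for the columns of DT. Note that from R 4.2.0 onward, passing a condition of length greater than one to if is an error rather than a warning, so the sketch tests only the first element to show what the old behaviour amounted to.

```r
# Plain vectors standing in for the columns of DT (values are made up)
price    <- c(1, 2, 3, 4)
currency <- c("aud", "eur", "aud", "eur")
aud      <- rep(2, 4)
eur      <- rep(3, 4)

# ifelse() tests every element and picks the matching product row by row
vectorised <- ifelse(currency == "aud", price * aud, price * eur)

# if () acts on a single condition; testing only the first element sends
# every row down the same branch, which is wrong for the eur rows
first_only <- if (currency[1] == "aud") price * aud else price * eur

vectorised
first_only
```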
> B_msk <- matrix(rep(DT$currency,2), ncol=2)==matrix(rep(colnames(DT)[-(1:3)], 10), ncol=2, byrow=TRUE)
> DF <- data.frame(DT)
> DF$in_USD <- rowSums(DF[colnames(DT)[-(1:3)]]*B_msk*DF$price)
> DF #or data.table(DF)
day price currency aud eur in_USD
1 1 0.30776611 aud 1.624996 2.035811 0.5001188
2 2 0.25767250 eur 1.882166 2.210804 0.5696634
3 3 0.55232243 aud 1.280354 2.038349 0.7071681
4 4 0.05638315 eur 1.398488 2.248972 0.1268041
5 5 0.46854928 aud 1.762551 1.920101 0.8258420
6 6 0.48377074 eur 1.669022 1.671420 0.8085842
7 7 0.81240262 aud 1.204612 2.270302 0.9786301
8 8 0.37032054 eur 1.357525 2.381954 0.8820863
9 9 0.54655860 aud 1.359475 2.049097 0.7430328
10 10 0.17026205 eur 1.690291 1.777724 0.3026789
Hopefully this solution addresses the memory issue (though it still requires the data to be in a data.frame):
> Idx=cbind(1:10,match(DT[,currency], colnames(DT))) #replace 10 with the actual no. of obs.
> DF=data.frame(DT)
> DF
day price currency aud eur
1 1 0.30776611 aud 1.624996 2.035811
2 2 0.25767250 eur 1.882166 2.210804
3 3 0.55232243 aud 1.280354 2.038349
4 4 0.05638315 eur 1.398488 2.248972
5 5 0.46854928 aud 1.762551 1.920101
6 6 0.48377074 eur 1.669022 1.671420
7 7 0.81240262 aud 1.204612 2.270302
8 8 0.37032054 eur 1.357525 2.381954
9 9 0.54655860 aud 1.359475 2.049097
10 10 0.17026205 eur 1.690291 1.777724
> DF$price*as.numeric(DF[Idx]) #assign it as 'DF$P_in_USD'
[1] 0.5001187 0.5696634 0.7071682 0.1268041 0.8258420 0.8085841 0.9786299 0.8820865 0.7430327 0.3026789
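The matrix-indexing idea above can be sketched without hard-coding the number of observations or the column positions. This is only an illustration on made-up numbers; the names rate_cols, rates and idx are hypothetical, and it assumes every value in currency has a rate column of the same name. Indexing a numeric matrix of rates, rather than the whole data.frame, also avoids the round trip through character values.

```r
# Small data.frame with made-up numbers; column names follow the example
DF <- data.frame(
  day      = 1:4,
  price    = c(1, 2, 3, 4),
  currency = c("aud", "eur", "eur", "aud"),
  aud      = c(2, 2, 2, 2),
  eur      = c(3, 3, 3, 3),
  stringsAsFactors = FALSE
)

# Rate columns are found by name: one per distinct value in `currency`
rate_cols <- unique(DF$currency)
rates     <- as.matrix(DF[rate_cols])  # numeric matrix, one column per currency

# One (row, column) pair per observation, built from nrow() instead of a literal
idx <- cbind(seq_len(nrow(DF)), match(DF$currency, colnames(rates)))

DF$P_in_USD <- DF$price * rates[idx]
DF
```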
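Answer 2 (score: 1)
Have you considered simply looping over the currencies: filter the main data frame to keep only the prices in a given currency, perform the conversion on that subset, and finally stack all the per-currency data frames (or progressively fill in a column of the main data frame)?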
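A minimal sketch of that loop, using the "progressively fill in a column" variant, might look like the following. The table and its values are made up, and it assumes each value in currency matches the name of a rate column.

```r
library(data.table)

# Made-up table with the same layout as the example in the question
DT <- data.table(
  day      = 1:4,
  price    = c(1, 2, 3, 4),
  currency = c("aud", "eur", "eur", "aud"),
  aud      = rep(2, 4),
  eur      = rep(3, 4)
)

# Start with an empty numeric column, then fill it one currency at a time
DT[, price.in.usd := NA_real_]
for (cur in unique(DT$currency)) {
  rows <- which(DT$currency == cur)  # rows priced in this currency
  set(DT, i = rows, j = "price.in.usd",
      value = DT$price[rows] * DT[[cur]][rows])
}
DT
```

The loop runs once per currency rather than once per row, so with about 40 currencies it stays at about 40 iterations regardless of the number of observations.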