Efficiently converting each row's values to another currency

Time: 2020-04-22 13:20:30

Tags: r

Suppose we have these two data.frames:

rates <- data.frame(Date = c("2000-01-01","2000-03-02","2000-03-25","2000-04-13"), Euro = c(1.2,1.27,1.4,1.6), Aussie = c(0.85,0.82,0.93,0.89))
toConvert <- data.frame(Date = c("2000-01-01","2000-03-02","2000-03-25","2000-04-13"), currency = c("Euro","Aussie","Euro","Aussie"),
                        Value1 = c(5,6,7,8), Value2 = c(10,15,23,85),Value3 = c(50,60,89,93))

Given the exchange rates in the rates table, is there an efficient way to convert the Value1, Value2 and Value3 columns to US dollars?

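Essentially, the code has to look up the rate based on the Date and currency columns of the toConvert table, and divide Value1, Value2 and Value3 by the corresponding exchange rate in the rates table.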

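I have tried a for loop, which works but takes a long time. (Keep in mind that these are sample tables. The actual toConvert table has 100k rows, and the rates table has daily data going back to 1990.)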

What I have tried:

for (i in 1:nrow(toConvert)) {
  # Look up the rate for this row's date and currency (index with i, not 1)
  rate <- as.numeric(rates[rates$Date == toConvert[i,]$Date, as.character(toConvert[i,]$currency)])
  toConvert[i,]$Value1 <- toConvert[i,]$Value1 / rate
  toConvert[i,]$Value2 <- toConvert[i,]$Value2 / rate
  toConvert[i,]$Value3 <- toConvert[i,]$Value3 / rate
}

With 100k rows this takes a very long time.

Expected output:

    Date    currency   Value1    Value2    Value3
2000-01-01     Euro 4.166667  8.333333  41.66667
2000-03-02   Aussie 7.317073 18.292683  73.17073
2000-03-25     Euro 5.000000 16.428571  63.57143
2000-04-13   Aussie 8.988764 95.505618 104.49438

4 answers:

Answer 0 (score: 3)

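If you have a large amount of data, I suggest taking a look at the data.table package. Melting rates into long format lets a single update join look up each row's rate by Date and currency: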

library(data.table)

setDT(rates)
setDT(toConvert)

lrates <- melt(rates,id.vars = "Date")
toConvert[lrates,c("Value1", "Value2", "Value3"):=.SD/i.value,on=.(Date,currency=variable),.SDcols = c("Value1", "Value2", "Value3")]

toConvert
#>          Date currency   Value1    Value2    Value3
#> 1: 2000-01-01     Euro 4.166667  8.333333  41.66667
#> 2: 2000-03-02   Aussie 7.317073 18.292683  73.17073
#> 3: 2000-03-25     Euro 5.000000 16.428571  63.57143
#> 4: 2000-04-13   Aussie 8.988764 95.505618 104.49438

Created on 2020-04-22 by the reprex package (v0.3.0)
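
For readers who would rather avoid an extra dependency, the same lookup can be sketched in base R. This is not part of the answer above, only a minimal sketch under the assumption that every Date in toConvert appears exactly once in rates and that every currency name matches a column of rates. The helper names rate_matrix, row_idx, col_idx and value_cols are made up for illustration.

```r
# Sample tables from the question; stringsAsFactors = FALSE keeps Date and
# currency as plain character vectors on any R version
rates <- data.frame(
  Date   = c("2000-01-01", "2000-03-02", "2000-03-25", "2000-04-13"),
  Euro   = c(1.2, 1.27, 1.4, 1.6),
  Aussie = c(0.85, 0.82, 0.93, 0.89),
  stringsAsFactors = FALSE
)
toConvert <- data.frame(
  Date     = c("2000-01-01", "2000-03-02", "2000-03-25", "2000-04-13"),
  currency = c("Euro", "Aussie", "Euro", "Aussie"),
  Value1 = c(5, 6, 7, 8), Value2 = c(10, 15, 23, 85), Value3 = c(50, 60, 89, 93),
  stringsAsFactors = FALSE
)

# Rate columns as a numeric matrix: one row per date, one column per currency
rate_matrix <- as.matrix(rates[, setdiff(names(rates), "Date")])

# For every row of toConvert, find the matching date row and currency column
row_idx <- match(toConvert$Date, rates$Date)
col_idx <- match(toConvert$currency, colnames(rate_matrix))

# Indexing with a two-column matrix picks one cell per (row, column) pair;
# an unmatched date or currency yields NA for that row
rate <- rate_matrix[cbind(row_idx, col_idx)]

# Divide all value columns by the per-row rate in one vectorized step
value_cols <- c("Value1", "Value2", "Value3")
toConvert[value_cols] <- toConvert[value_cols] / rate
```

Because match() and matrix indexing are vectorized, the whole table is processed without an explicit loop over rows. Rows whose date or currency is missing from rates end up as NA rather than raising an error, so it may be worth checking anyNA(rate) on real data.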

Answer 1 (score: 1)

Use the tidyverse (dplyr and tidyr) to process the whole data.frame at once.

rates <- data.frame(
  Date = c("2000-01-01","2000-03-02","2000-03-25","2000-04-13"), 
  Euro = c(1.2,1.27,1.4,1.6), 
  Aussie = c(0.85,0.82,0.93,0.89))
# Note I've added a capital C to "Currency" for consistency
toConvert <- data.frame(
  Date = c("2000-01-01","2000-03-02","2000-03-25","2000-04-13"), 
  Currency = c("Euro","Aussie","Euro","Aussie"),
  Value1 = c(5,6,7,8), Value2 = c(10,15,23,85), Value3 = c(50,60,89,93))


library("dplyr")
library("tidyr")

# Gather the rates into a tidier (long) format
rates <- rates %>%
  tidyr::gather(Currency, Rate, Euro, Aussie)

rates
#>         Date Currency Rate
#> 1 2000-01-01     Euro 1.20
#> 2 2000-03-02     Euro 1.27
#> 3 2000-03-25     Euro 1.40
#> 4 2000-04-13     Euro 1.60
#> 5 2000-01-01   Aussie 0.85
#> 6 2000-03-02   Aussie 0.82
#> 7 2000-03-25   Aussie 0.93
#> 8 2000-04-13   Aussie 0.89

# Add the rate with a join - this is the trick
toConvert %>%
  left_join(rates, by = c("Date", "Currency")) %>% 
# From there it is easy to convert everything at once
  mutate_at(vars(starts_with("Value")), ~ .x / Rate)
#> Warning: Column `Currency` joining factor and character vector, coercing into
#> character vector
#>         Date Currency   Value1    Value2    Value3 Rate
#> 1 2000-01-01     Euro 4.166667  8.333333  41.66667 1.20
#> 2 2000-03-02   Aussie 7.317073 18.292683  73.17073 0.82
#> 3 2000-03-25     Euro 5.000000 16.428571  63.57143 1.40
#> 4 2000-04-13   Aussie 8.988764 95.505618 104.49438 0.89

Created on 2020-04-22 by the reprex package (v0.3.0)

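You can ignore the warning raised by the join, or avoid it by making sure Currency is a character vector rather than a factor (typically by using stringsAsFactors = FALSE when loading the data).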
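
As a hedged illustration of that last point, here is a minimal sketch of the same join with the tables built as character columns, so no coercion warning is raised. It is a variant, not the answer's exact code: it assumes dplyr >= 1.0.0 (for across()) and tidyr >= 1.0.0 (for pivot_longer()), and the names long_rates and converted are made up for illustration. Note that stringsAsFactors already defaults to FALSE from R 4.0.0 onwards.

```r
library(dplyr)
library(tidyr)

# Same sample data, explicitly kept as character vectors
rates <- data.frame(
  Date   = c("2000-01-01", "2000-03-02", "2000-03-25", "2000-04-13"),
  Euro   = c(1.2, 1.27, 1.4, 1.6),
  Aussie = c(0.85, 0.82, 0.93, 0.89),
  stringsAsFactors = FALSE
)
toConvert <- data.frame(
  Date     = c("2000-01-01", "2000-03-02", "2000-03-25", "2000-04-13"),
  Currency = c("Euro", "Aussie", "Euro", "Aussie"),
  Value1 = c(5, 6, 7, 8), Value2 = c(10, 15, 23, 85), Value3 = c(50, 60, 89, 93),
  stringsAsFactors = FALSE
)

# One row per (Date, Currency) pair; Currency comes out as character
long_rates <- rates %>%
  pivot_longer(cols = -Date, names_to = "Currency", values_to = "Rate")

converted <- toConvert %>%
  left_join(long_rates, by = c("Date", "Currency")) %>%
  # Overwrite every Value* column with its converted amount
  mutate(across(starts_with("Value"), ~ .x / Rate)) %>%
  # Drop the helper column so the result has the shape of the expected output
  select(-Rate)
```

Both join keys are character here, so left_join() has nothing to coerce. Dropping Rate at the end is optional; keeping it can be handy for auditing the conversion.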

Answer 2 (score: 0)

array[:,j]

Answer 3 (score: 0)

A tidy solution...

library(dplyr)
library(tidyr)  # pivot_longer() comes from tidyr, not dplyr

tidy_rates <- rates %>% pivot_longer(cols = -c(Date), names_to = "currency", values_to = "rate")

toConvert %>% left_join(tidy_rates, by = c("Date" = "Date", "currency" = "currency")) %>% 
  mutate(
    value1_converted = Value1/rate,
    value2_converted = Value2/rate,
    value3_converted = Value3/rate
  )
