Suppose we have these two data.frames:
rates <- data.frame(Date = c("2000-01-01","2000-03-02","2000-03-25","2000-04-13"), Euro = c(1.2,1.27,1.4,1.6), Aussie = c(0.85,0.82,0.93,0.89))
toConvert <- data.frame(Date = c("2000-01-01","2000-03-02","2000-03-25","2000-04-13"), currency = c("Euro","Aussie","Euro","Aussie"),
Value1 = c(5,6,7,8), Value2 = c(10,15,23,85),Value3 = c(50,60,89,93))
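Given the exchange rates in the rates table, is there an efficient way to convert the Value1, Value2 and Value3 columns to US dollars?
Essentially, the code has to look up the rate based on the Date and currency columns of the toConvert table, then divide Value1, Value2 and Value3 by the matching exchange rate from the rates table.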
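I have tried a for loop, which works but takes a very long time. (Keep in mind that these are sample tables. The actual toConvert table has 100k rows, and the rates table has daily data going back to 1990.)
What I have tried: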
for (i in 1:nrow(toConvert)) {
  # Look up the rate for this row's date and this row's currency
  rate <- as.numeric(rates[rates$Date == toConvert[i,]$Date, as.character(toConvert[i,]$currency)])
  toConvert[i,]$Value1 <- toConvert[i,]$Value1 / rate
  toConvert[i,]$Value2 <- toConvert[i,]$Value2 / rate
  toConvert[i,]$Value3 <- toConvert[i,]$Value3 / rate
}
With 100k rows this takes a considerable amount of time.
Expected output:
Date currency Value1 Value2 Value3
2000-01-01 Euro 4.166667 8.333333 41.66667
2000-03-02 Aussie 7.317073 18.292683 73.17073
2000-03-25 Euro 5.000000 16.428571 63.57143
2000-04-13 Aussie 8.988764 95.505618 104.49438
Answer 0: (score: 3)
If you have a lot of data, I suggest taking a look at the data.table package:
library(data.table)
setDT(rates)
setDT(toConvert)
lrates <- melt(rates,id.vars = "Date")
toConvert[lrates,c("Value1", "Value2", "Value3"):=.SD/i.value,on=.(Date,currency=variable),.SDcols = c("Value1", "Value2", "Value3")]
toConvert
#> Date currency Value1 Value2 Value3
#> 1: 2000-01-01 Euro 4.166667 8.333333 41.66667
#> 2: 2000-03-02 Aussie 7.317073 18.292683 73.17073
#> 3: 2000-03-25 Euro 5.000000 16.428571 63.57143
#> 4: 2000-04-13 Aussie 8.988764 95.505618 104.49438
Created on 2020-04-22 by the reprex package (v0.3.0)
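For readers who want to see the lookup spelled out without an extra package, here is a minimal base R sketch of the same idea: find each row's position in rates with match(), then pull the rate out by matrix indexing. It assumes every Date/currency pair in toConvert is present in rates (a missing pair would give NA), and the names rate_matrix, row_idx, col_idx and value_cols are purely illustrative.

```r
rates <- data.frame(
  Date   = c("2000-01-01", "2000-03-02", "2000-03-25", "2000-04-13"),
  Euro   = c(1.2, 1.27, 1.4, 1.6),
  Aussie = c(0.85, 0.82, 0.93, 0.89),
  stringsAsFactors = FALSE
)
toConvert <- data.frame(
  Date     = c("2000-01-01", "2000-03-02", "2000-03-25", "2000-04-13"),
  currency = c("Euro", "Aussie", "Euro", "Aussie"),
  Value1   = c(5, 6, 7, 8),
  Value2   = c(10, 15, 23, 85),
  Value3   = c(50, 60, 89, 93),
  stringsAsFactors = FALSE
)

# Rates as a numeric matrix: one row per date, one column per currency
rate_matrix <- as.matrix(rates[, setdiff(names(rates), "Date"), drop = FALSE])

# Row and column position of each toConvert row inside rate_matrix
row_idx <- match(toConvert$Date, rates$Date)
col_idx <- match(toConvert$currency, colnames(rate_matrix))

# Indexing a matrix with a two-column (row, column) matrix
# returns one element per row of the index, i.e. one rate per row
rate <- rate_matrix[cbind(row_idx, col_idx)]

# Divide all value columns at once; rate is recycled down each column
value_cols <- c("Value1", "Value2", "Value3")
toConvert[value_cols] <- toConvert[value_cols] / rate
toConvert
```

On the sample data this gives the same values as the expected output in the question. There is no per-row loop, so the work is done by a handful of vectorised calls regardless of how many rows toConvert has.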
Answer 1: (score: 1)
Use the tidyverse (dplyr and tidyr) to process the whole data.frame in one go.
rates <- data.frame(
Date = c("2000-01-01","2000-03-02","2000-03-25","2000-04-13"),
Euro = c(1.2,1.27,1.4,1.6),
Aussie = c(0.85,0.82,0.93,0.89))
# Note I've added a capital C to "Currency" for consistency
toConvert <- data.frame(
Date = c("2000-01-01","2000-03-02","2000-03-25","2000-04-13"),
Currency = c("Euro","Aussie","Euro","Aussie"),
Value1 = c(5,6,7,8), Value2 = c(10,15,23,85), Value3 = c(50,60,89,93))
library("dplyr")
library("tidyr")
# Gathering the rates into a tidier format
rates <- rates %>%
tidyr::gather(Currency, Rate, Euro, Aussie)
rates
#> Date Currency Rate
#> 1 2000-01-01 Euro 1.20
#> 2 2000-03-02 Euro 1.27
#> 3 2000-03-25 Euro 1.40
#> 4 2000-04-13 Euro 1.60
#> 5 2000-01-01 Aussie 0.85
#> 6 2000-03-02 Aussie 0.82
#> 7 2000-03-25 Aussie 0.93
#> 8 2000-04-13 Aussie 0.89
# Add the rate with a join - this is the trick
toConvert %>%
left_join(rates, by = c("Date", "Currency")) %>%
# From there it is easy to convert everything at once
mutate_at(vars(starts_with("Value")), ~ .x / Rate)
#> Warning: Column `Currency` joining factor and character vector, coercing into
#> character vector
#> Date Currency Value1 Value2 Value3 Rate
#> 1 2000-01-01 Euro 4.166667 8.333333 41.66667 1.20
#> 2 2000-03-02 Aussie 7.317073 18.292683 73.17073 0.82
#> 3 2000-03-25 Euro 5.000000 16.428571 63.57143 1.40
#> 4 2000-04-13 Aussie 8.988764 95.505618 104.49438 0.89
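Created on 2020-04-22 by the reprex package (v0.3.0)
You can either ignore the warning raised by the join, or resolve it by making sure Currency is a character vector rather than a factor (typically by using stringsAsFactors = FALSE when loading the data).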
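As a sketch of the second option, the data can be built with stringsAsFactors = FALSE so that the join keys are character vectors on both sides. The version below also swaps in pivot_longer() and across(), which tidyr and dplyr (1.0.0 and later) provide as the current replacements for gather() and mutate_at(); that substitution is an assumption about which package versions are installed, not part of the answer above. Since R 4.0.0, data.frame() defaults to stringsAsFactors = FALSE, so the explicit argument only matters on older versions. The names rates_long and converted are illustrative.

```r
library(dplyr)
library(tidyr)

rates <- data.frame(
  Date   = c("2000-01-01", "2000-03-02", "2000-03-25", "2000-04-13"),
  Euro   = c(1.2, 1.27, 1.4, 1.6),
  Aussie = c(0.85, 0.82, 0.93, 0.89),
  stringsAsFactors = FALSE
)
toConvert <- data.frame(
  Date     = c("2000-01-01", "2000-03-02", "2000-03-25", "2000-04-13"),
  Currency = c("Euro", "Aussie", "Euro", "Aussie"),
  Value1   = c(5, 6, 7, 8),
  Value2   = c(10, 15, 23, 85),
  Value3   = c(50, 60, 89, 93),
  stringsAsFactors = FALSE
)

# Long format: one row per Date/Currency pair.
# names_to produces a character column, matching toConvert$Currency.
rates_long <- rates %>%
  pivot_longer(cols = -Date, names_to = "Currency", values_to = "Rate")

# Both keys are character on each side, so the join has nothing to coerce
converted <- toConvert %>%
  left_join(rates_long, by = c("Date", "Currency")) %>%
  mutate(across(starts_with("Value"), ~ .x / Rate)) %>%
  select(-Rate)  # drop the helper column to match the expected output
converted
```

Dropping Rate at the end is optional; keeping it can be handy for checking which rate was applied to each row.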
Answer 3: (score: 0)
A tidy solution...
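library(dplyr)
library(tidyr)  # pivot_longer() comes from tidyr, not dplyr
# Reshape rates to one row per Date/currency pair
tidy_rates <- rates %>% pivot_longer(cols = -c(Date), names_to = "currency", values_to = "rate")
# Attach the matching rate to each row, then convert into new columns
toConvert %>% left_join(tidy_rates, by = c("Date" = "Date", "currency" = "currency")) %>%
  mutate(
    value1_converted = Value1/rate,
    value2_converted = Value2/rate,
    value3_converted = Value3/rate
  )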
Output