I'm struggling with R loops and I don't know how to solve my problem. I have a data frame with three columns:
base_currency quote_currency api_key
1 USD AUD USDAUD13
2 USD CAD USDCAD58
3 EUR CNY EURCNY99
4 EUR CZK EURCZK65
5 USD EUR USDEUR45
6 JPY HKD JPYHKD33
7 JPY RUB JPYRUB83
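These are all the currency pairs for which I have a data source to get the exchange rate through an API. As you can see, I can convert USD to AUD (and back), USD to CAD (and back), and so on.
I can't convert USD to CNY directly, but I can convert USD to EUR and then EUR to CNY, using an intermediate currency pair to handle the conversion.
With this system, I can likewise convert AUD to CAD using the USD/AUD and USD/CAD pairs. In fact, every currency in the first 5 rows can be converted to any other currency appearing in those rows.
My data frame may also contain currency pairs that are isolated from this system, for example JPY/HKD and JPY/RUB. With these pairs I can obtain HKD/RUB, but that's it. The only way a second "system" of currency pairs can be linked to the first one is by sharing one of its currencies, in either the base_currency column or the quote_currency column.
My goal is to define a "supported currencies" list. This list will contain currencies that can each be converted to any other currency in the list.
I can see that my data frame offers two solutions to this problem: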
[1] "USD" "AUD" "CAD" "EUR" "CNY" "CZK"
[2] "JPY" "HKD" "RUB"
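The solution I'm interested in is the first one, because it contains "USD".
My real data frame contains more than 100 currency pairs, some of them coming from different data sources.
To give you more context, I'm building a very basic stock portfolio manager with Shiny:
In the settings, the user can specify a "portfolio currency" from a dropdown list.
When adding a stock to the portfolio, the user has to specify the stock's currency from a similar dropdown list.
I'd really like to use that "supported currencies" list to build my dropdowns, so that they update dynamically when I add currency pairs to the data frame.
For example, if I add USD/JPY to the data frame, my dropdowns will show these choices: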
"USD" "AUD" "CAD" "EUR" "CNY" "CZK" "JPY" "HKD" "RUB"
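This task seems too complicated for my modest R skills, so I would really appreciate a little help.
Thanks a lot!
@Cedric Thank you very much for your answer. I edited your code to add extra fake currency pairs to see how it reacts, and something isn't working: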
v<-"base_currency;quote_currency;api_key
1;USD;AUD;USDAUD13
2;USD;CAD;USDCAD58
3;EUR;CNY;EURCNY99
4;EUR;CZK;EURCZK65
5;USD;EUR;USDEUR45
6;JPY;HKD;JPYHKD33
7;JPY;RUB;JPYRUB83
8;ALL;AKU;ALLAKU24
9;AKU;RRR;AKURRR96
10;KKL;LOI;KKLLOI46"
d<-read.delim(textConnection(v),header=TRUE,sep=";",strip.white=TRUE,stringsAsFactors =F)
## (1) check for values appearing in both columns
## those will be linked
mm <- d$base_currency%in%d$quote_currency | d$quote_currency%in%d$base_currency
currency_both_sides<-unique(c(d$base_currency[mm],d$quote_currency[mm]))
## (2) find remaining (unlinked) matching pairs for those
d1<-d$base_currency[d$quote_currency%in%currency_both_sides]
d2<-d$quote_currency[d$base_currency%in%currency_both_sides]
(common <- unique(c(d1,d2,currency_both_sides)))
# "EUR" "USD" "ALL" "AKU" "AUD" "CAD" "CNY" "CZK" "RRR"
## (3) the other will only appear on one side
## Here I'm showing all but in the end it will be every single value,
## with all its matching values in the second column
## they will form separate sets
nn <- !d$base_currency%in%common | !d$quote_currency%in%common
(onesided<-unique(c(d$base_currency[nn],d$quote_currency[nn])))
# "JPY" "KKL" "HKD" "RUB" "LOI"
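The common vector ("EUR" "USD" "ALL" "AKU" "AUD" "CAD" "CNY" "CZK" "RRR") contains ALL, AKU and RRR. These three currencies can be converted into each other, but not into any other currency in that vector, so they should not appear in the list. Do you have any idea?
Again, thank you very much for your help.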
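A quick check, offered as a possible explanation rather than a confirmed diagnosis: the test in step (1) keeps any currency that appears in both columns, whatever group it belongs to. With the ten pairs above, that test picks up AKU as well as EUR, and step (2) then pulls in ALL and RRR through AKU. The two vectors below simply restate the currency columns of d so the snippet runs on its own.

```r
# Restating the two currency columns of d (same ten pairs as above)
base_currency  <- c("USD", "USD", "EUR", "EUR", "USD", "JPY", "JPY", "ALL", "AKU", "KKL")
quote_currency <- c("AUD", "CAD", "CNY", "CZK", "EUR", "HKD", "RUB", "AKU", "RRR", "LOI")

# Currencies that appear on both sides of some pair
both_sides <- intersect(base_currency, quote_currency)
both_sides
# [1] "EUR" "AKU"

# Rows flagged by the step (1) test: the AKU rows (8 and 9) are included
# even though they have no link to USD
flagged <- which(base_currency %in% quote_currency | quote_currency %in% base_currency)
flagged
# [1] 3 4 5 8 9
```

Appearing in both columns only shows that a currency links two pairs to each other; it says nothing about whether those pairs are connected to USD.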
UPDATE: I tried something that seems to go in the right direction:
v<-"base_currency;quote_currency;api_key
1;USD;AUD;USDAUD13
2;USD;CAD;USDCAD58
3;EUR;CNY;EURCNY99
4;EUR;CZK;EURCZK65
5;USD;EUR;USDEUR45
6;JPY;HKD;JPYHKD33
7;JPY;RUB;JPYRUB83
8;ALL;AKU;ALLAKU24
9;AKU;RRR;AKURRR96
10;KKL;LOI;KKLLOI46"
d<-read.delim(textConnection(v),header=TRUE,sep=";",strip.white=TRUE,stringsAsFactors =F)
d
# base_currency quote_currency api_key
#1 USD AUD USDAUD13
#2 USD CAD USDCAD58
#3 EUR CNY EURCNY99
#4 EUR CZK EURCZK65
#5 USD EUR USDEUR45
#6 JPY HKD JPYHKD33
#7 JPY RUB JPYRUB83
#8 ALL AKU ALLAKU24
#9 AKU RRR AKURRR96
#10 KKL LOI KKLLOI46
#Select every currency that appears in the dataframe
all_cur <- c(d$base_currency, d$quote_currency)
#all_cur
# [1] "USD" "USD" "EUR" "EUR" "USD" "JPY" "JPY" "ALL" "AKU" "KKL" "AUD" "CAD" "CNY" "CZK" "EUR" "HKD" "RUB" "AKU" "RRR" "LOI"
#Select only unique items
all_cur_unique <- unique(all_cur)
#all_cur_unique
# [1] "USD" "EUR" "JPY" "ALL" "AKU" "KKL" "AUD" "CAD" "CNY" "CZK" "HKD" "RUB" "RRR" "LOI"
#for each unique currency create a vector containing that currency and
#each currency associated with it in a currency pair
A <- lapply (as.list(all_cur_unique) , function (i) c(i,subset(d$base_currency, d$quote_currency == i), subset(d$quote_currency, d$base_currency == i)))
A
#
#[[1]]
#[1] "USD" "AUD" "CAD" "EUR"
#USD group : every currency in this vector can be converted in any other through USD
#
#
#[[2]]
#[1] "EUR" "USD" "CNY" "CZK"
#EUR group : every currency in this vector can be converted in any other through EUR
#
#
#[[3]]
#[1] "JPY" "HKD" "RUB"
#JPY group : every currency in this vector can be converted in any other through JPY
#
#
#[[4]]
#[1] "ALL" "AKU"
#
#[[5]]
#[1] "AKU" "ALL" "RRR"
#
#[[6]]
#[1] "KKL" "LOI"
#
#[[7]]
#[1] "AUD" "USD"
#
#[[8]]
#[1] "CAD" "USD"
#
#[[9]]
#[1] "CNY" "EUR"
#
#[[10]]
#[1] "CZK" "EUR"
#
#[[11]]
#[1] "HKD" "JPY"
#
#[[12]]
#[1] "RUB" "JPY"
#
#[[13]]
#[1] "RRR" "AKU"
#
#[[14]]
#[1] "LOI" "KKL"
Using this list of vectors, I first need to select the ones containing "USD", because USD has to be part of the "supported currencies", so I need these items:
[[1]]
[1] "USD" "AUD" "CAD" "EUR"
[[2]]
[1] "EUR" "USD" "CNY" "CZK"
[[7]]
[1] "AUD" "USD"
[[8]]
[1] "CAD" "USD"
Then I need to combine these vectors and keep only the unique occurrences, which I managed to do like this:
B <- sapply(A, function(x) is.element('USD', x))
usd_convertible_list <- A[B]
usd_convertible_vector <- Reduce(c, usd_convertible_list)
usd_convertible_vector_unique <- unique(usd_convertible_vector)
usd_convertible_vector_unique
# "USD" "AUD" "CAD" "EUR" "CNY" "CZK"
Then, for each currency in that vector, I again need to select every vector in the list that contains that currency:
For "USD":
[[1]]
[1] "USD" "AUD" "CAD" "EUR"
[[2]]
[1] "EUR" "USD" "CNY" "CZK"
[[7]]
[1] "AUD" "USD"
[[8]]
[1] "CAD" "USD"
For "AUD":
[[1]]
[1] "USD" "AUD" "CAD" "EUR"
[[7]]
[1] "AUD" "USD"
For "CAD":
[[1]]
[1] "USD" "AUD" "CAD" "EUR"
[[8]]
[1] "CAD" "USD"
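And so on for every currency in "USD" "AUD" "CAD" "EUR" "CNY" "CZK". Then I combine everything into a new vector and compare it with the previous one; if new currencies appear, the operation is repeated.
When no new currency is added to the vector, the list is complete and the loop should stop. With the currency pairs provided in the data frame, the list is already complete after the first run, but I think this process is needed if a conversion has to go through more than one intermediate currency pair.
For example: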
USD EUR
EUR CNY
CNY RUB
RUB CHF
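In this case, even if it isn't obvious, every currency can be converted into any other. To find that out, starting from the first vector containing USD, the loop needs to run 3 times.
I believe this process should give me the "supported currencies" list I'm looking for, but I'm struggling to turn it into code...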
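The process described above can be sketched as a loop that keeps expanding a set of currencies and stops as soon as a pass adds nothing new. This is only a minimal base R sketch under stated assumptions: the function name reachable_from is hypothetical, the data frame is assumed to have the base_currency and quote_currency columns used throughout the question, and the JPY/HKD row is added to the chain example purely to show that an isolated pair is left out.

```r
# Hypothetical helper (not from the original post): expand a set of
# currencies until no new currency can be reached through any pair.
reachable_from <- function(pairs, start = "USD") {
  supported <- start
  repeat {
    # Rows in which at least one side is already in the set
    hit <- pairs$base_currency %in% supported |
           pairs$quote_currency %in% supported
    expanded <- union(supported,
                      c(pairs$base_currency[hit], pairs$quote_currency[hit]))
    # Nothing new was added on this pass: the list is complete
    if (length(expanded) == length(supported)) break
    supported <- expanded
  }
  supported
}

# The chain example from above, plus one isolated pair for illustration
chain <- data.frame(
  base_currency  = c("USD", "EUR", "CNY", "RUB", "JPY"),
  quote_currency = c("EUR", "CNY", "RUB", "CHF", "HKD"),
  stringsAsFactors = FALSE
)

reachable_from(chain, "USD")
# [1] "USD" "EUR" "CNY" "RUB" "CHF"
```

Because the stopping rule only compares the size of the set before and after a pass, the same function should work however many intermediate pairs a conversion needs.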
Answer 0 (score: 0)
v<-"a;b;c
1;USD;AUD;USDAUD13
2;USD;CAD;USDCAD58
3;EUR;CNY;EURCNY99
4;EUR;CZK;EURCZK65
5;USD;EUR;USDEUR45
6;JPY;HKD;JPYHKD33
7;JPY;RUB;JPYRUB83
8;ALL;AKU;ALLAKU24
9;AKU;RRR;AKURRR96
10;KKL;LOI;KKLLOI46"
d<-read.delim(textConnection(v),header=TRUE,sep=";",strip.white=TRUE,stringsAsFactors=FALSE)
d<-d[,-3] # not needed
e<-d[,c(2,1)]; colnames(e)<-colnames(d)
f<-rbind(d,e) # since you can run both one way or the other, I create a data
# frame mixing to and fro
library(dplyr) # left_join(), first() and last() come from dplyr
# this function will left join the df with itself using first and last
# column
# at some point some lines will produce NA (no matching values)
# we will not join using those values, so I'm splitting the dataframe
# in two and working only with the one without NA in last column
my_left_join <-function(df){
aa <- first(colnames(df))
cc <- last(colnames(df))
df0 <- df[is.na(df[,ncol(df)]),] # we will not join NA
df1 <- df[!is.na(df[,ncol(df)]),]
df1 <- left_join(df1,df1[,c(1,ncol(df1))],by=setNames(aa,cc))
df0[,last(colnames(df1))]<-rep(NA,nrow(df0))
df2 <- rbind(df0,df1)
}
(g<-my_left_join(f))
#a b b.y
#1 USD AUD USD
#2 USD CAD USD
#3 EUR CNY EUR
#4 EUR CZK EUR
#5 USD EUR CNY
#6 USD EUR CZK
#7 USD EUR USD
#8 JPY HKD JPY
#9 JPY RUB JPY
#10 ALL AKU RRR
#11 ALL AKU ALL
# here we see that we might run into loops, so let's remove values already in line
remove_duplicates_inrow <- function(df) {
df[,ncol(df)]<-apply(df,1,function(X){
if (X[length(X)]%in%X[1:(length(X)-1)]) X[length(X)]<-NA
return( X[length(X)])
})
return(df[order(df[ncol(df)]),])
}
(h<-remove_duplicates_inrow(g))
#a b b.y
#35 RRR AKU ALL
#17 CAD USD AUD
#26 EUR USD AUD
#15 AUD USD CAD
#27 EUR USD CAD
#5 USD EUR CNY
#23 CZK EUR CNY
#6 USD EUR CZK
#21 CNY EUR CZK
#16 AUD USD EUR
#19 CAD USD EUR
#31 RUB JPY HKD
#10 ALL AKU RRR
#30 HKD JPY RUB
#22 CNY EUR USD
#25 CZK EUR USD
#1 USD AUD <NA>
#2 USD CAD <NA>
# this function will recursively left join until there is no match left
# due to the way it is built I have to remove the last two columns
recursive_join <-function (df){
#print(df)
#browser()
df <- my_left_join(df)
df <- remove_duplicates_inrow(df)
if (all(is.na(df[,ncol(df)]))){
return(df[order(df[ncol(df)]),-ncol(df)])
} else {
recursive_join(df)
}
}
i<-recursive_join(f)
# everything is a mix, I sort by row and by col to obtain the right order
# order by row
i<-t(apply(i,1,function(X)X[order(X)]))
# order by all columns, note this is a problem as we don't know in advance
# the number of columns, I have asked a question regarding this.
i<-i[order(i[,1],i[,2],i[,3],i[,4]),]
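The last line assumes we only have 4 columns. I have posted a question here asking how to do this when the number of columns is unknown. Below is the adapted answer: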
col=""
for (j in 1:ncol(i)){
col <- paste(col,paste0( 'i[,',j,']' ), sep = "," )
}
## remove first comma
col <- substr(col,2,nchar(col))
# order the matrix i itself (the object built above), using every column
i <- eval(parse(text= paste("i[order(",col,",decreasing=TRUE),]")))
# now we have duplicated
i<-i[!duplicated(i),]
# OK these duplicates were the easy ones, but we have vectors of different
# length, let's remove vectors that are contained in longer vectors
res<-matrix(i[1, ],1,ncol(i))
for (l in 2:nrow(i)){
# comparing line with last in res but remove NA
# as we have sorted data this works !
if (!all(i[l,][!is.na(i[l,])]
%in%
res[nrow(res),][!is.na(res[nrow(res),])])){
res<-rbind(res,i[l,])
}
}
res
#[,1] [,2] [,3] [,4]
#[1,] "AKU" "ALL" "RRR" NA
#[2,] "AUD" "CAD" "EUR" "USD"
#[3,] "CNY" "CZK" "EUR" "USD"
#[4,] "HKD" "JPY" "RUB" NA
#[5,] "KKL" "LOI" NA NA
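In the res matrix above, "AUD" "CAD" "EUR" "USD" and "CNY" "CZK" "EUR" "USD" are listed as two rows, even though they share EUR and USD. If a single merged group per "system" is wanted, as the question describes, one possible alternative is sketched below. It is not part of the answer's method: the function name currency_groups is hypothetical, it uses base R only, and the two vectors restate columns a and b of d.

```r
# Hypothetical helper (not part of the answer above): split the currencies
# into groups whose members can all be converted into one another.
currency_groups <- function(base, quote) {
  remaining <- unique(c(base, quote))
  groups <- list()
  while (length(remaining) > 0) {
    group <- remaining[1]            # seed a new group
    repeat {
      # pairs touching the group bring in the currency on their other side
      hit <- base %in% group | quote %in% group
      grown <- union(group, c(base[hit], quote[hit]))
      if (length(grown) == length(group)) break
      group <- grown
    }
    groups[[length(groups) + 1]] <- group
    remaining <- setdiff(remaining, group)
  }
  groups
}

# Columns a and b of d, restated so the snippet runs on its own
base  <- c("USD", "USD", "EUR", "EUR", "USD", "JPY", "JPY", "ALL", "AKU", "KKL")
quote <- c("AUD", "CAD", "CNY", "CZK", "EUR", "HKD", "RUB", "AKU", "RRR", "LOI")

groups <- currency_groups(base, quote)

# Keep the group that contains USD
supported <- Filter(function(g) "USD" %in% g, groups)[[1]]
supported
# [1] "USD" "AUD" "CAD" "EUR" "CNY" "CZK"
```

With these ten pairs the sketch yields four groups, and the one containing USD matches the first solution listed in the question.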