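R - Building a list of "supported currencies" from currency pairs

Date: 2017-11-16 19:04:26

Tags: r

I'm struggling with R and can't figure out how to solve my problem. I have a data frame with three columns: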

  base_currency quote_currency       api_key
1           USD            AUD      USDAUD13
2           USD            CAD      USDCAD58
3           EUR            CNY      EURCNY99
4           EUR            CZK      EURCZK65
5           USD            EUR      USDEUR45
6           JPY            HKD      JPYHKD33
7           JPY            RUB      JPYRUB83

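These are the currency pairs for which I have a data source that provides exchange rates through an API. As you can see, I can convert USD to AUD (and back), USD to CAD (and back), and so on.

I can't convert USD to CNY directly, but I can convert USD to EUR and then EUR to CNY, so I can use intermediate currency pairs to handle the conversion.

With this system I can likewise convert AUD to CAD using the USD/AUD and USD/CAD pairs. In fact, every currency in the first 5 rows can be converted into any other currency appearing in those same rows.

My data frame may also contain currency pairs that are isolated from this system, such as JPY/HKD and JPY/RUB. With these pairs I can get HKD/RUB, but that's it. The only way a second "system" of currency pairs can be linked to the first one is by sharing one of its currencies in the base_currency or quote_currency column.

My goal is to define a list of "supported currencies". This list would contain the currencies that can be converted into any other currency in that list.

I can see that my data frame offers two solutions to this problem: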

[1] "USD" "AUD" "CAD" "EUR" "CNY" "CZK"
[2] "JPY" "HKD" "RUB"

The solution I'm interested in is the first one, because it contains "USD".
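The goal described above can be read as a grouping problem: two currencies belong to the same group when some chain of pairs links them. The following is only a sketch under that reading, not code from the thread. The helper name `currency_groups` is hypothetical, and the data frame is retyped from the sample above without the api_key column.

```r
# Sample pairs retyped from the question's data frame.
pairs <- data.frame(
  base_currency  = c("USD", "USD", "EUR", "EUR", "USD", "JPY", "JPY"),
  quote_currency = c("AUD", "CAD", "CNY", "CZK", "EUR", "HKD", "RUB"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split all currencies into mutually convertible groups.
currency_groups <- function(pairs) {
  currencies <- unique(c(pairs$base_currency, pairs$quote_currency))
  # Start with every currency in its own group, labelled 1..n.
  label <- setNames(seq_along(currencies), currencies)
  repeat {
    changed <- FALSE
    for (k in seq_len(nrow(pairs))) {
      b <- pairs$base_currency[k]
      q <- pairs$quote_currency[k]
      # Both ends of a pair take the smaller of their two labels.
      low <- min(label[[b]], label[[q]])
      if (label[[b]] != low || label[[q]] != low) {
        label[c(b, q)] <- low
        changed <- TRUE
      }
    }
    # Stop once a full pass over the pairs changes nothing.
    if (!changed) break
  }
  unname(split(names(label), label))
}

groups <- currency_groups(pairs)
# Keep the group that contains USD.
supported <- groups[[which(sapply(groups, function(g) "USD" %in% g))]]
print(groups)
print(supported)
```

On the sample data this yields two groups, and `supported` holds the six currencies of the first solution (possibly in a different order than listed above).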

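My real data frame contains more than 100 currency pairs, some of them coming from different data sources.

To give you more context, I'm building a very basic stock portfolio manager with Shiny:

  1. In the settings, the user can specify the "portfolio currency" from a dropdown list.

  2. When adding a stock to the portfolio, the user has to specify the stock's currency from a similar dropdown list.

  3. I would really like to use that "supported currencies" list to build my dropdown menus, so that they are updated dynamically when I add currency pairs to the data frame.

    For example, if I add USD/JPY to the data frame, my dropdown menus would show these options: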

    "USD" "AUD" "CAD" "EUR" "CNY" "CZK" "JPY" "HKD" "RUB"
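The USD/JPY example can be checked with a small sketch. Everything here is an assumption for illustration: `reachable_from` is a hypothetical helper, and the pairs are retyped from the sample data frame. In a Shiny app the resulting vector could presumably be passed as `choices` to `updateSelectInput()`, which is left out here so the snippet runs without Shiny.

```r
# Sample pairs retyped from the question's data frame.
pairs <- data.frame(
  base_currency  = c("USD", "USD", "EUR", "EUR", "USD", "JPY", "JPY"),
  quote_currency = c("AUD", "CAD", "CNY", "CZK", "EUR", "HKD", "RUB"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: every currency reachable from `start` through any chain of pairs.
reachable_from <- function(pairs, start = "USD") {
  found <- start
  repeat {
    # Pairs that touch at least one currency found so far.
    touching <- pairs$base_currency %in% found | pairs$quote_currency %in% found
    grown <- union(found, c(pairs$base_currency[touching],
                            pairs$quote_currency[touching]))
    # No growth means the set is complete.
    if (length(grown) == length(found)) return(found)
    found <- grown
  }
}

before <- reachable_from(pairs)

# Add the USD/JPY pair and recompute the dropdown choices.
pairs <- rbind(pairs,
               data.frame(base_currency = "USD", quote_currency = "JPY",
                          stringsAsFactors = FALSE))
after <- reachable_from(pairs)
print(before)
print(after)
```

Before the new pair is added the result has six currencies; afterwards it has all nine, because JPY now links the second group to the first.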
    

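This task seems too complex for my modest R skills, so I would really appreciate a little help.

Thank you very much!

@Cedric Thank you very much for your answer. I edited your code to add extra fake currency pairs to check how it reacts, and something doesn't work: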

    v<-"base_currency;quote_currency;api_key
    1;USD;AUD;USDAUD13
    2;USD;CAD;USDCAD58
    3;EUR;CNY;EURCNY99
    4;EUR;CZK;EURCZK65
    5;USD;EUR;USDEUR45
    6;JPY;HKD;JPYHKD33
    7;JPY;RUB;JPYRUB83
    8;ALL;AKU;ALLAKU24
    9;AKU;RRR;AKURRR96
    10;KKL;LOI;KKLLOI46"
    
    d<-read.delim(textConnection(v),header=TRUE,sep=";",strip.white=TRUE,stringsAsFactors=FALSE)
    
    
    ## (1) check for values appearing in both columns
    ## those will be linked
    mm <- d$base_currency%in%d$quote_currency | d$quote_currency%in%d$base_currency
    currency_both_sides<-unique(c(d$base_currency[mm],d$quote_currency[mm]))
    ## (2) find remaining (unlinked) matching pairs for those
    d1<-d$base_currency[d$quote_currency%in%currency_both_sides]
    d2<-d$quote_currency[d$base_currency%in%currency_both_sides]
    (common <- unique(c(d1,d2,currency_both_sides)))
    # "EUR" "USD" "ALL" "AKU" "AUD" "CAD" "CNY" "CZK" "RRR"
    ## (3) the other will only appear on one side
    ## Here I'm showing all of them, but in the end it will be every single value,
    ## with all its matching values in the second column
    ## they will form separate sets
    nn <- !d$base_currency%in%common | !d$quote_currency%in%common
    (onesided<-unique(c(d$base_currency[nn],d$quote_currency[nn])))
    # "JPY" "KKL" "HKD" "RUB" "LOI"
    

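The common vector ("EUR" "USD" "ALL" "AKU" "AUD" "CAD" "CNY" "CZK" "RRR") contains ALL, AKU and RRR. These three currencies can be converted into one another, but not into any of the other currencies in that vector, so they should not appear in the list. Do you have any idea? Again, thank you very much for your help.

Update: I tried something that seems to go in the right direction: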

    v<-"base_currency;quote_currency;api_key
    1;USD;AUD;USDAUD13
    2;USD;CAD;USDCAD58
    3;EUR;CNY;EURCNY99
    4;EUR;CZK;EURCZK65
    5;USD;EUR;USDEUR45
    6;JPY;HKD;JPYHKD33
    7;JPY;RUB;JPYRUB83
    8;ALL;AKU;ALLAKU24
    9;AKU;RRR;AKURRR96
    10;KKL;LOI;KKLLOI46"
    
    d<-read.delim(textConnection(v),header=TRUE,sep=";",strip.white=TRUE,stringsAsFactors=FALSE)
    d
    #   base_currency quote_currency  api_key
    #1            USD            AUD USDAUD13
    #2            USD            CAD USDCAD58
    #3            EUR            CNY EURCNY99
    #4            EUR            CZK EURCZK65
    #5            USD            EUR USDEUR45
    #6            JPY            HKD JPYHKD33
    #7            JPY            RUB JPYRUB83
    #8            ALL            AKU ALLAKU24
    #9            AKU            RRR AKURRR96
    #10           KKL            LOI KKLLOI46
    
    #Select every currency that appears in the dataframe
    all_cur <- c(d$base_currency, d$quote_currency)
    
    #all_cur
    # [1] "USD" "USD" "EUR" "EUR" "USD" "JPY" "JPY" "ALL" "AKU" "KKL" "AUD" "CAD" "CNY" "CZK" "EUR" "HKD" "RUB" "AKU" "RRR" "LOI"
    
    #Select only unique items
    all_cur_unique <- unique(all_cur)
    
    #all_cur_unique
    # [1] "USD" "EUR" "JPY" "ALL" "AKU" "KKL" "AUD" "CAD" "CNY" "CZK" "HKD" "RUB" "RRR" "LOI"
    
    
     #for each unique currency create a vector containing that currency and
     #each currency associated with it in a currency pair
     A <- lapply (as.list(all_cur_unique) , function (i) c(i,subset(d$base_currency, d$quote_currency == i), subset(d$quote_currency, d$base_currency == i)))
    
    A
    #
    #[[1]]
    #[1] "USD" "AUD" "CAD" "EUR"
    #USD group : every currency in this vector can be converted in any other through USD
    #
    #
    #[[2]]
    #[1] "EUR" "USD" "CNY" "CZK"
    #EUR group : every currency in this vector can be converted in any other through EUR
    #
    #
    #[[3]]
    #[1] "JPY" "HKD" "RUB"
    #JPY group : every currency in this vector can be converted in any other through JPY
    #
    #
    #[[4]]
    #[1] "ALL" "AKU"
    #
    #[[5]]
    #[1] "AKU" "ALL" "RRR"
    #
    #[[6]]
    #[1] "KKL" "LOI"
    #
    #[[7]]
    #[1] "AUD" "USD"
    #
    #[[8]]
    #[1] "CAD" "USD"
    #
    #[[9]]
    #[1] "CNY" "EUR"
    #
    #[[10]]
    #[1] "CZK" "EUR"
    #
    #[[11]]
    #[1] "HKD" "JPY"
    #
    #[[12]]
    #[1] "RUB" "JPY"
    #
    #[[13]]
    #[1] "RRR" "AKU"
    #
    #[[14]]
    #[1] "LOI" "KKL"
    

Now, using this list of vectors, I first need to select the ones that contain "USD", because USD must be among the "supported currencies". So I need these items:

    [[1]]
    [1] "USD" "AUD" "CAD" "EUR"
    
    [[2]]
    [1] "EUR" "USD" "CNY" "CZK"
    
    [[7]]
    [1] "AUD" "USD"
    
    [[8]]
    [1] "CAD" "USD"
    

Then I need to combine these vectors and keep only the unique values, which I managed to do like this:

    B <- sapply(A, function(x) is.element('USD', x))
    usd_convertible_list <- A[B]
    usd_convertible_vector <- Reduce(c, usd_convertible_list)
    usd_convertible_vector_unique <- unique(usd_convertible_vector)
    usd_convertible_vector_unique
    
    #    "USD" "AUD" "CAD" "EUR" "CNY" "CZK"
    

Then, for each currency in that vector, I need to select again every vector in the list that contains that currency:

for "USD":

    [[1]]
    [1] "USD" "AUD" "CAD" "EUR"
    
    [[2]]
    [1] "EUR" "USD" "CNY" "CZK"
    
    [[7]]
    [1] "AUD" "USD"
    
    [[8]]
    [1] "CAD" "USD"
    

for "AUD":

    [[1]]
    [1] "USD" "AUD" "CAD" "EUR"
    
    [[7]]
    [1] "AUD" "USD"
    

for "CAD":

    [[1]]
    [1] "USD" "AUD" "CAD" "EUR"
    
    [[8]]
    [1] "CAD" "USD"
    

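And so on for each currency in "USD" "AUD" "CAD" "EUR" "CNY" "CZK". Then combine everything into a new vector, compare that vector with the previous one, and repeat the operation if new currencies have appeared.

When no new currency is added to the vector, the list is complete and the loop should stop. With the currency pairs of the data frame above, the list is already complete after the first run, but I think this process is necessary when a conversion has to go through more than one intermediate pair.

For example: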

    USD    EUR
    EUR    CNY
    CNY    RUB
    RUB    CHF
    

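In this case, even if it doesn't look obvious, every currency can be converted into any other. To get there, starting from the selection of the first vectors containing USD, the loop needs to run 3 times.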
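The loop described above can be sketched as follows. This is only an illustration of that description, run on the four-pair chain example. The list `A` is built the same way as in the code above, while `expand_once` and `passes` are hypothetical names introduced here.

```r
# The chain example: USD/EUR, EUR/CNY, CNY/RUB, RUB/CHF.
chain <- data.frame(
  base_currency  = c("USD", "EUR", "CNY", "RUB"),
  quote_currency = c("EUR", "CNY", "RUB", "CHF"),
  stringsAsFactors = FALSE
)

# One vector per currency: the currency itself plus its direct partners.
all_cur_unique <- unique(c(chain$base_currency, chain$quote_currency))
A <- lapply(all_cur_unique, function(i) {
  c(i,
    chain$base_currency[chain$quote_currency == i],
    chain$quote_currency[chain$base_currency == i])
})

# Hypothetical helper: unique currencies of every vector in A that
# contains at least one of `currencies`.
expand_once <- function(A, currencies) {
  hit <- sapply(A, function(x) any(currencies %in% x))
  unique(unlist(A[hit]))
}

supported <- "USD"
passes <- 0
repeat {
  passes <- passes + 1
  bigger <- expand_once(A, supported)
  # Nothing new was added: the list is complete.
  if (length(bigger) == length(supported)) break
  supported <- bigger
}
print(supported)
print(passes)  # 3 on this chain, counting the final pass that adds nothing
```

Comparing lengths is enough as a stopping test because each pass can only add currencies, never drop them: every currency already found is contained in its own vector in `A`.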

I believe this process should give me the "supported currencies" list I'm looking for, but I'm having a hard time turning it into code...

1 Answer:

Answer 0 (score: 0)

v<-"a;b;c
    1;USD;AUD;USDAUD13
    2;USD;CAD;USDCAD58
    3;EUR;CNY;EURCNY99
    4;EUR;CZK;EURCZK65
    5;USD;EUR;USDEUR45
    6;JPY;HKD;JPYHKD33
    7;JPY;RUB;JPYRUB83
    8;ALL;AKU;ALLAKU24
    9;AKU;RRR;AKURRR96
    10;KKL;LOI;KKLLOI46"
d<-read.delim(textConnection(v),header=TRUE,sep=";",strip.white=TRUE,stringsAsFactors=FALSE)
d<-d[,-3] # not needed
e<-d[,c(2,1)]; colnames(e)<-colnames(d)
f<-rbind(d,e) # since you can run both one way or the other, I create a data
# frame mixing to and fro
require(dplyr)
# this function will left join the df with itself using first and last 
# column
# at some point some lines will produce NA (no matching values)
# we will not join using those values, so I'm splitting the dataframe
# in two and working only with the one without NA in last column
my_left_join <-function(df){
  aa <- first(colnames(df))
  cc <- last(colnames(df))  
  df0 <- df[is.na(df[,ncol(df)]),] # we will not join NA
  df1 <- df[!is.na(df[,ncol(df)]),]
  df1 <- left_join(df1,df1[,c(1,ncol(df1))],by=setNames(aa,cc))
  df0[,last(colnames(df1))]<-rep(NA,nrow(df0))
  df2 <- rbind(df0,df1)
}
(g<-my_left_join(f))
#a   b b.y
#1  USD AUD USD
#2  USD CAD USD
#3  EUR CNY EUR
#4  EUR CZK EUR
#5  USD EUR CNY
#6  USD EUR CZK
#7  USD EUR USD
#8  JPY HKD JPY
#9  JPY RUB JPY
#10 ALL AKU RRR
#11 ALL AKU ALL
# here we see that we might run into loops, so let's remove values already in line
remove_duplicates_inrow <- function(df) {
  df[,ncol(df)]<-apply(df,1,function(X){
        if (X[length(X)]%in%X[1:(length(X)-1)])  X[length(X)]<-NA 
        return( X[length(X)])
      })
  return(df[order(df[ncol(df)]),])
}
(h<-remove_duplicates_inrow(g))
#a   b  b.y
#35 RRR AKU  ALL
#17 CAD USD  AUD
#26 EUR USD  AUD
#15 AUD USD  CAD
#27 EUR USD  CAD
#5  USD EUR  CNY
#23 CZK EUR  CNY
#6  USD EUR  CZK
#21 CNY EUR  CZK
#16 AUD USD  EUR
#19 CAD USD  EUR
#31 RUB JPY  HKD
#10 ALL AKU  RRR
#30 HKD JPY  RUB
#22 CNY EUR  USD
#25 CZK EUR  USD
#1  USD AUD <NA>
#2  USD CAD <NA>
# this function will recursively left join until there is no match left
# due to the way it is built I have to remove the last two columns
recursive_join <-function (df){
  #print(df)
  #browser()
  df <- my_left_join(df)
  df <- remove_duplicates_inrow(df)
  if (all(is.na(df[,ncol(df)]))){
    return(df[order(df[ncol(df)]),-ncol(df)])
  } else {
    recursive_join(df)
  }
}

i<-recursive_join(f)
# everything is a mix, I sort by row and by col to obtain the right order
# order by row
i<-t(apply(i,1,function(X)X[order(X)]))
# order by all columns, note this is a problem as we don't know in advance
# the number of columns, I have asked a question regarding this.
i<-i[order(i[,1],i[,2],i[,3],i[,4]),]

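The last line assumes that we have only 4 columns; I have posted a question here asking how to do this when the number of columns is unknown. Below is the adapted answer: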

col=""
for (j in 1:ncol(i)){
  col <- paste(col,paste0( 'i[,',j,']' ), sep = "," )
}
## remove first comma
col <- substr(col,2,nchar(col))
i <- eval(parse(text= paste("i[order(",col,",decreasing=TRUE),]")))
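As an alternative to building the call as a string, the columns can be handed to `order()` as separate sort keys with `do.call()`. This is a sketch of a swapped-in technique, not the code from the linked answer; the matrix `m` is a small made-up stand-in for the character matrix `i`.

```r
# Small illustrative stand-in for the character matrix `i` (values are made up).
m <- rbind(
  c("CNY", "CZK", "EUR", "USD"),
  c("AKU", "ALL", NA,    NA),
  c("AUD", "CAD", "EUR", "USD"),
  c("AKU", "ALL", "RRR", NA)
)

# One sort key per column, however many columns there are.
keys <- lapply(seq_len(ncol(m)), function(j) m[, j])

# order() sorts by the first key, breaks ties with the next ones,
# and puts NA last by default.
row_order <- do.call(order, keys)
m_sorted <- m[row_order, , drop = FALSE]
print(m_sorted)
```

Sorting in decreasing order should work the same way by appending the named argument, as in `do.call(order, c(keys, list(decreasing = TRUE)))`.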



# now we have duplicated rows
i<-i[!duplicated(i),]
# OK these duplicates were the easy ones, but we have vectors of different
# length, let's remove vectors that are contained in longer vectors

res<-matrix(i[1, ],1,ncol(i))
for (l in 2:nrow(i)){      
  # comparing line with last in res but remove NA
  # as we have sorted data this works !
  if (!all(i[l,][!is.na(i[l,])]
  %in%
  res[nrow(res),][!is.na(res[nrow(res),])])){    
    res<-rbind(res,i[l,]) 
  }  
}
res
#[,1]  [,2]  [,3]  [,4] 
#[1,] "AKU" "ALL" "RRR" NA   
#[2,] "AUD" "CAD" "EUR" "USD"
#[3,] "CNY" "CZK" "EUR" "USD"
#[4,] "HKD" "JPY" "RUB" NA   
#[5,] "KKL" "LOI" NA    NA