Converting a triple-nested list to a data frame

Time: 2018-04-29 21:47:28

Tags: r nested-lists ibrokers

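I am trying to convert a triple-nested list into a data frame. This question helped, but I still can't get the data frame I want.

The list is an option chain obtained from IBrokers; a shortened version is shown below. I have uploaded a more detailed copy of the actual chain here.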

Chain <- 
  list(
    list(
      list(
        list(version="8",contract=list(symbol="BHP",right="C",expiry="20180621",strike="25")),
        list(version="8",contract=list(symbol="BHP",right="C",expiry="20180621",strike="26"))
      ),
      list(
        list(version="8",contract=list(symbol="BHP",right="C",expiry="20180730",strike="25")),
        list(version="8",contract=list(symbol="BHP",right="C",expiry="20180730",strike="26"))
      )
    ),
    list(
      list(
        list(version="8",contract=list(symbol="CBA",right="C",expiry="20180621",strike="65")),
        list(version="8",contract=list(symbol="CBA",right="C",expiry="20180621",strike="64"))
      ),
      list(
        list(version="8",contract=list(symbol="CBA",right="C",expiry="20180730",strike="65")),
        list(version="8",contract=list(symbol="CBA",right="C",expiry="20180730",strike="64"))
      )
    )
  )

I would like to convert the list into a data frame like this:

Contracts <- data.frame(symbol=c("BHP","BHP","BHP","BHP","CBA","CBA","CBA","CBA"),
                        right=c("C","C","C","C","C","C","C","C"),
                        expiry=c("20180621","20180621","20180730","20180730","20180621","20180621","20180730","20180730"),
                        strike=c("25","26","25","26","65","64","65","64"))

I tried the following code, but it does not give me the data frame I want.

X <- lapply(Chain,function(x) as.data.frame.list(lapply(x,as.data.frame.list)))
dfx <- do.call(rbind,X)

Any suggestions?

2 Answers:

Answer 0: (score: 2)

How about the following?

df <- as.data.frame(matrix(unlist(Chain, recursive = TRUE), ncol = 5, byrow = TRUE)[, -1])
colnames(df) <- c("symbol", "right", "expiry", "strike")
df
#  symbol right   expiry strike
#1    BHP     C 20180621     25
#2    BHP     C 20180621     26
#3    BHP     C 20180730     25
#4    BHP     C 20180730     26
#5    CBA     C 20180621     65
#6    CBA     C 20180621     64
#7    CBA     C 20180730     65
#8    CBA     C 20180730     64

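Explanation: recursively unlist the nested Chain, reshape the result into a matrix with five columns (each entry contributes version plus the four contract fields), drop the version column, and convert to a data.frame. The only minor drawback is that the column names have to be added manually.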
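As a variation on the steps above, the column count and names can be taken from the names that unlist attaches to the flattened vector instead of being typed in by hand. This is only a sketch: it assumes every entry has the same fields in the same order, and the two-contract Chain below is a cut-down stand-in for the one in the question.

```r
# Cut-down stand-in for the Chain in the question (two contracts only).
Chain <- list(list(list(
  list(version = "8",
       contract = list(symbol = "BHP", right = "C", expiry = "20180621", strike = "25")),
  list(version = "8",
       contract = list(symbol = "BHP", right = "C", expiry = "20180621", strike = "26"))
)))

flat   <- unlist(Chain)          # named character vector, 5 values per contract
fields <- unique(names(flat))    # "version", "contract.symbol", ...

# Assumption: every entry has exactly these fields, in this order.
m  <- matrix(flat, ncol = length(fields), byrow = TRUE,
             dimnames = list(NULL, fields))
df <- as.data.frame(m, stringsAsFactors = FALSE)

df <- df[, names(df) != "version"]               # drop the version column
names(df) <- sub("^contract\\.", "", names(df))  # strip the "contract." prefix
df
```

If some entries were missing a field, the matrix would be filled out of alignment, so this shortcut is only safe when the structure is regular.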

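Update

Since your actual data is structured quite differently, here is one possibility. Note: I assume the structure from the Gist is stored in tbl.

tbl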
#Source: local data frame [2 x 6]
#Groups: <by row>
#
## A tibble: 2 x 6
#  symbol sectype exch  currency multiplier Chain
#  <fct>  <fct>   <fct> <fct>    <fct>      <list>
#1 BHP    OPT     ASX   AUD      100        <list [1,241]>
#2 CBA    OPT     ASX   AUD      100        <list [1,204]>

The following produces a list containing two data.frames, one for each row of tbl.

lst <- lapply(tbl$Chain, function(x)
    do.call(rbind.data.frame, lapply(x, function(y) as.data.frame(unclass(y$contract)))))
#List of 2
# $ :'data.frame':  1241 obs. of  16 variables:
#  ..$ conId          : Factor w/ 1241 levels "198440202","198440207",..: 1 2 3 4 5 6 7 8 9 10 ...
#  ..$ symbol         : Factor w/ 1 level "BHP": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ sectype        : Factor w/ 1 level "OPT": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ exch           : Factor w/ 1 level "ASX": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ primary        : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ expiry         : Factor w/ 18 levels "20180628","20181220",..: 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ strike         : Factor w/ 118 levels "25","26","27",..: 1 1 2 2 3 3 4 4 5 5 ...
#  ..$ currency       : Factor w/ 1 level "AUD": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ right          : Factor w/ 2 levels "C","P": 1 2 1 2 1 2 1 2 1 2 ...
#  ..$ local          : Factor w/ 1241 levels "BHPV78","BHPV88",..: 1 2 3 4 5 6 7 8 9 10 ...
#  ..$ multiplier     : Factor w/ 1 level "100": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ combo_legs_desc: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ comboleg       : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ include_expired: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ secIdType      : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ secId          : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# $ :'data.frame':  1204 obs. of  16 variables:
#  ..$ conId          : Factor w/ 1204 levels "198447027","198447030",..: 1 2 3 4 5 6 7 8 9 10 ...
#  ..$ symbol         : Factor w/ 1 level "CBA": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ sectype        : Factor w/ 1 level "OPT": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ exch           : Factor w/ 1 level "ASX": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ primary        : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ expiry         : Factor w/ 18 levels "20180628","20181220",..: 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ strike         : Factor w/ 179 levels "79.68","81.68",..: 1 1 2 2 3 3 4 4 5 5 ...
#  ..$ currency       : Factor w/ 1 level "AUD": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ right          : Factor w/ 2 levels "C","P": 1 2 1 2 1 2 1 2 1 2 ...
#  ..$ local          : Factor w/ 1204 levels "CBAKT9","CBAKU9",..: 1 2 3 4 5 6 7 8 9 10 ...
#  ..$ multiplier     : Factor w/ 1 level "100": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ combo_legs_desc: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ comboleg       : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ include_expired: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ secIdType      : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
#  ..$ secId          : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
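If a single data frame is wanted rather than one per symbol, the pieces in lst can be stacked with rbind. The sketch below is self-contained and uses a small hypothetical stand-in (chains) for tbl$Chain, with only four fields per contract; the real contract objects carry more fields, as the str output above shows. It also keeps the columns as character and converts strike to numeric, which is a choice of this example rather than part of the answer.

```r
# Hypothetical stand-in for tbl$Chain: one list of contract entries per symbol.
chains <- list(
  list(
    list(version = "8",
         contract = list(symbol = "BHP", right = "C", expiry = "20180621", strike = "25")),
    list(version = "8",
         contract = list(symbol = "BHP", right = "P", expiry = "20180621", strike = "25"))
  ),
  list(
    list(version = "8",
         contract = list(symbol = "CBA", right = "C", expiry = "20180621", strike = "65"))
  )
)

# One data.frame per symbol, built from the contract element of each entry.
lst <- lapply(chains, function(x)
  do.call(rbind, lapply(x, function(y)
    as.data.frame(unclass(y$contract), stringsAsFactors = FALSE))))

# Stack the per-symbol data frames and make strike numeric.
contracts <- do.call(rbind, lst)
contracts$strike <- as.numeric(contracts$strike)
contracts
```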

Answer 1: (score: 1)

You can use unstack:

 unstack(data.frame(d<-unlist(Chain),names(d)))
  contract.expiry contract.right contract.strike contract.symbol version
1        20180621              C              25             BHP       8
2        20180621              C              26             BHP       8
3        20180730              C              25             BHP       8
4        20180730              C              26             BHP       8
5        20180621              C              65             CBA       8
6        20180621              C              64             CBA       8
7        20180730              C              65             CBA       8
8        20180730              C              64             CBA       8

If you prefer, you can remove the word contract:

unstack(data.frame(d<-unlist(Chain),sub(".*[.]","",names(d))))
    expiry right strike symbol version
1 20180621     C     25    BHP       8
2 20180621     C     26    BHP       8
3 20180730     C     25    BHP       8
4 20180730     C     26    BHP       8
5 20180621     C     65    CBA       8
6 20180621     C     64    CBA       8
7 20180730     C     65    CBA       8
8 20180730     C     64    CBA       8

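This can also be written as unstack(data.frame(d<-unlist(Chain),sub("contract[.]","",names(d)))), although I prefer to keep the contract prefix, since it shows which columns actually make up the desired contract data frame.

You can even rename the columns after unstacking.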
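A minimal sketch of renaming after unstacking, assuming the same shape as the Chain in the question (cut down to two contracts here). The column names values and ind and the explicit formula are choices of this example; the answer relies on unstack's default formula instead.

```r
# Cut-down stand-in for the Chain in the question (two contracts only).
Chain <- list(list(list(
  list(version = "8",
       contract = list(symbol = "BHP", right = "C", expiry = "20180621", strike = "25")),
  list(version = "8",
       contract = list(symbol = "BHP", right = "C", expiry = "20180621", strike = "26"))
)))

d    <- unlist(Chain)
long <- data.frame(values = d, ind = names(d), stringsAsFactors = FALSE)

# Unstack first, keeping the "contract." prefix on the column names.
wide <- unstack(long, values ~ ind)

# Rename afterwards, then keep only the contract columns in the desired order.
names(wide) <- sub("^contract\\.", "", names(wide))
contracts   <- wide[, c("symbol", "right", "expiry", "strike")]
contracts
```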

With the new data:

a=readLines("https://raw.githubusercontent.com/hughandersen/OptionsTrading/master/Stocks_option_chain")
b=eval(parse(text=paste(a,collapse="")))
s=unstack(data.frame(d<-unlist(b[6]),names(d)))