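I am trying to convert a triply nested list into a data frame. This question helped, but I still cannot get the data frame I want.
The list is an option chain obtained from IBrokers; a cut-down version is shown below. I have uploaded a more detailed version of the actual chain here.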
Chain <-
  list(
    list(
      list(
        list(version="8",contract=list(symbol="BHP",right="C",expiry="20180621",strike="25")),
        list(version="8",contract=list(symbol="BHP",right="C",expiry="20180621",strike="26"))
      ),
      list(
        list(version="8",contract=list(symbol="BHP",right="C",expiry="20180730",strike="25")),
        list(version="8",contract=list(symbol="BHP",right="C",expiry="20180730",strike="26"))
      )
    ),
    list(
      list(
        list(version="8",contract=list(symbol="CBA",right="C",expiry="20180621",strike="65")),
        list(version="8",contract=list(symbol="CBA",right="C",expiry="20180621",strike="64"))
      ),
      list(
        list(version="8",contract=list(symbol="CBA",right="C",expiry="20180730",strike="65")),
        list(version="8",contract=list(symbol="CBA",right="C",expiry="20180730",strike="64"))
      )
    )
  )
I would like to convert the list into a data frame like this:
Contracts <- data.frame(symbol=c("BHP","BHP","BHP","BHP","CBA","CBA","CBA","CBA"),
right=c("C","C","C","C","C","C","C","C"),
expiry=c("20180621","20180621","20180730","20180730","20180621","20180621","20180730","20180730"),
strike=c("25","26","25","26","65","64","65","64"))
I tried the following code, but it does not give me the data frame I want.
X <- lapply(Chain,function(x) as.data.frame.list(lapply(x,as.data.frame.list)))
dfx <- do.call(rbind,X)
Any suggestions?
Answer 0 (score: 2)
How about the following?
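# Flatten the nested list, refill it as a 5-column matrix (one row per contract), and drop column 1 (version).
df <- as.data.frame(matrix(unlist(Chain, recursive = TRUE), ncol = 5, byrow = TRUE)[, -1])
colnames(df) <- c("symbol", "right", "expiry", "strike")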
# symbol right expiry strike
#1 BHP C 20180621 25
#2 BHP C 20180621 26
#3 BHP C 20180730 25
#4 BHP C 20180730 26
#5 CBA C 20180621 65
#6 CBA C 20180621 64
#7 CBA C 20180730 65
#8 CBA C 20180730 64
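Explanation: recursively unlist the nested Chain, reshape the result into a matrix, drop the version column, and convert to a data.frame. The only minor drawback is that the column names have to be added manually.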
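The same steps can be spelled out one at a time. The sketch below is only an illustration, not part of the original answer: it uses a two-contract stand-in for Chain, and it takes the column names from the names that unlist generates instead of typing them by hand. It assumes every record has the same five fields in the same order.

```r
# Two-contract stand-in for the nested list from the question (assumption: same shape as Chain).
Chain <- list(list(list(
  list(version = "8", contract = list(symbol = "BHP", right = "C", expiry = "20180621", strike = "25")),
  list(version = "8", contract = list(symbol = "BHP", right = "C", expiry = "20180621", strike = "26"))
)))

# Step 1: flatten everything into one named character vector.
flat <- unlist(Chain)

# Step 2: each contract contributes 5 values, so fill a 5-column matrix row by row.
m <- matrix(flat, ncol = 5, byrow = TRUE)

# Step 3: reuse the names of the first record as column names, minus the "contract." prefix.
colnames(m) <- sub("^contract\\.", "", names(flat)[1:5])

# Step 4: drop `version` by name and convert to a data frame of character columns.
df <- as.data.frame(m[, colnames(m) != "version", drop = FALSE], stringsAsFactors = FALSE)
```

Dropping version by name rather than by position keeps the code working if the field order changes, as long as the order is the same in every record.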
Since your actual data is structured quite differently, here is one possibility.
Note: I assume the structure from the Gist is stored in tbl.
tbl;
#Source: local data frame [2 x 6]
#Groups: <by row>
#
## A tibble: 2 x 6
# symbol sectype exch currency multiplier Chain
# <fct> <fct> <fct> <fct> <fct> <list>
#1 BHP OPT ASX AUD 100 <list [1,241]>
#2 CBA OPT ASX AUD 100 <list [1,204]>
The following builds a list containing two data.frames, one for each row of tbl.
lst <- lapply(tbl$Chain, function(x)
do.call(rbind.data.frame, lapply(x, function(y) as.data.frame(unclass(y$contract)))))
#List of 2
# $ :'data.frame': 1241 obs. of 16 variables:
# ..$ conId : Factor w/ 1241 levels "198440202","198440207",..: 1 2 3 4 5 6 7 8 9 10 ...
# ..$ symbol : Factor w/ 1 level "BHP": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ sectype : Factor w/ 1 level "OPT": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ exch : Factor w/ 1 level "ASX": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ primary : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ expiry : Factor w/ 18 levels "20180628","20181220",..: 1 1 1 1 1 1 1 1 1 1 ...
# ..$ strike : Factor w/ 118 levels "25","26","27",..: 1 1 2 2 3 3 4 4 5 5 ...
# ..$ currency : Factor w/ 1 level "AUD": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ right : Factor w/ 2 levels "C","P": 1 2 1 2 1 2 1 2 1 2 ...
# ..$ local : Factor w/ 1241 levels "BHPV78","BHPV88",..: 1 2 3 4 5 6 7 8 9 10 ...
# ..$ multiplier : Factor w/ 1 level "100": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ combo_legs_desc: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ comboleg : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ include_expired: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ secIdType : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ secId : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# $ :'data.frame': 1204 obs. of 16 variables:
# ..$ conId : Factor w/ 1204 levels "198447027","198447030",..: 1 2 3 4 5 6 7 8 9 10 ...
# ..$ symbol : Factor w/ 1 level "CBA": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ sectype : Factor w/ 1 level "OPT": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ exch : Factor w/ 1 level "ASX": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ primary : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ expiry : Factor w/ 18 levels "20180628","20181220",..: 1 1 1 1 1 1 1 1 1 1 ...
# ..$ strike : Factor w/ 179 levels "79.68","81.68",..: 1 1 2 2 3 3 4 4 5 5 ...
# ..$ currency : Factor w/ 1 level "AUD": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ right : Factor w/ 2 levels "C","P": 1 2 1 2 1 2 1 2 1 2 ...
# ..$ local : Factor w/ 1204 levels "CBAKT9","CBAKU9",..: 1 2 3 4 5 6 7 8 9 10 ...
# ..$ multiplier : Factor w/ 1 level "100": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ combo_legs_desc: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ comboleg : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ include_expired: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ secIdType : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ secId : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
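If a single data frame is wanted, the per-symbol data frames can be stacked with rbind. The sketch below is an illustration under assumptions: chains is a hypothetical three-record stand-in for tbl$Chain, and stringsAsFactors = FALSE is added so that strike can be converted to numeric directly (calling as.numeric on a factor would return the level codes instead).

```r
# Hypothetical stand-in for tbl$Chain: one list of contract records per symbol.
chains <- list(
  list(
    list(version = "8", contract = list(symbol = "BHP", right = "C", expiry = "20180621", strike = "25")),
    list(version = "8", contract = list(symbol = "BHP", right = "P", expiry = "20180621", strike = "25"))
  ),
  list(
    list(version = "8", contract = list(symbol = "CBA", right = "C", expiry = "20180621", strike = "65"))
  )
)

# One data frame per symbol, built from each record's `contract` element.
lst <- lapply(chains, function(x)
  do.call(rbind.data.frame,
          lapply(x, function(y) as.data.frame(unclass(y$contract), stringsAsFactors = FALSE))))

# Stack the per-symbol data frames into one and tidy up.
contracts <- do.call(rbind, lst)
rownames(contracts) <- NULL
contracts$strike <- as.numeric(contracts$strike)  # safe here because strike is character, not factor
```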
Answer 1 (score: 1)
You can use unstack:
unstack(data.frame(d<-unlist(Chain),names(d)))
contract.expiry contract.right contract.strike contract.symbol version
1 20180621 C 25 BHP 8
2 20180621 C 26 BHP 8
3 20180730 C 25 BHP 8
4 20180730 C 26 BHP 8
5 20180621 C 65 CBA 8
6 20180621 C 64 CBA 8
7 20180730 C 65 CBA 8
8 20180730 C 64 CBA 8
If you prefer, you can drop the word contract from the column names:
unstack(data.frame(d<-unlist(Chain),sub(".*[.]","",names(d))))
expiry right strike symbol version
1 20180621 C 25 BHP 8
2 20180621 C 26 BHP 8
3 20180730 C 25 BHP 8
4 20180730 C 26 BHP 8
5 20180621 C 65 CBA 8
6 20180621 C 64 CBA 8
7 20180730 C 65 CBA 8
8 20180730 C 64 CBA 8
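This can also be written as unstack(data.frame(d<-unlist(Chain),sub("contract[.]","",names(d)))),
although I would rather keep the contract prefix, so that it is clear which columns make up the desired contract data frame.
You can even change the names after unstacking.
Using the new data: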
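As a sketch of renaming after unstacking (an illustration, not the answerer's code): the prefix is kept during unstack, used to pick out the contract columns in the order the question asks for, and only then stripped. The two-contract Chain and the names long, wide, wanted, values and ind are assumptions made for the example.

```r
# Two-contract stand-in for the nested list from the question.
Chain <- list(list(list(
  list(version = "8", contract = list(symbol = "BHP", right = "C", expiry = "20180621", strike = "25")),
  list(version = "8", contract = list(symbol = "BHP", right = "C", expiry = "20180621", strike = "26"))
)))

d <- unlist(Chain)

# Long format: one row per value, with the generated name as the grouping column.
long <- data.frame(values = unname(d), ind = names(d), stringsAsFactors = FALSE)

# Wide format: one column per distinct name (sorted alphabetically by unstack).
wide <- unstack(long, values ~ ind)

# Select the contract.* columns in the desired order, then strip the prefix.
wanted <- paste0("contract.", c("symbol", "right", "expiry", "strike"))
Contracts <- wide[, wanted]
names(Contracts) <- sub("^contract\\.", "", names(Contracts))
```

Selecting by the prefixed names also drops version without referring to it by position.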
a=readLines("https://raw.githubusercontent.com/hughandersen/OptionsTrading/master/Stocks_option_chain")
b=eval(parse(text=paste(a,collapse="")))
s=unstack(data.frame(d<-unlist(b[6]),names(d)))