I have this list:
dput(data)
structure(list(open = structure(c(NA, 135.600006, 136.759995), .Dim = c(3L,
1L), .Dimnames = list(structure(c("2016-01-01", "2016-01-04",
"2016-01-05"), .Dim = c(3L, 1L)), "IBM")), high = structure(c(NA,
135.970001, 136.889999), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01",
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), low = structure(c(NA,
134.240005, 134.850006), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01",
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), close = structure(c(NA,
135.949997, 135.850006), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01",
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), volume = structure(c(NA,
5229400L, 3924800L), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01",
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), adj.close = structure(c(NA,
130.959683, 130.863362), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01",
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM"))), .Names = c("open",
"high", "low", "close", "volume", "adj.close"))
I am trying to convert this list to a data frame so that I can do further calculations.
I need the data frame to look like this:
Date Open High Low Close Volume
1985-01-02 3.18 3.18 3.08 3.08 1870906
I tried this:
do.call(rbind, data)
But I can't see the columns. Any ideas?
Answer 0: (score: 2)
I'll post my comment as an answer:
# Bind the six 3 x 1 matrices side by side and restore the list names as column names
data2 <- setNames(do.call('cbind.data.frame', data), names(data))
# Move the dates out of the row names into a regular column
data2$date <- row.names(data2)
row.names(data2) <- NULL
# Put the date column first (it was appended as column 7)
data2 <- cbind.data.frame(date = data2$date, data2[,-7])
date open high low close volume adj.close
1 2016-01-01 NA NA NA NA NA NA
2 2016-01-04 135.60 135.97 134.24 135.95 5229400 130.9597
3 2016-01-05 136.76 136.89 134.85 135.85 3924800 130.8634
Basically, we use cbind.data.frame instead of rbind to get close to what we want: rbind stacks the six 3 x 1 matrices into a single 18 x 1 column, whereas cbind places them side by side. From there, it is just a matter of reorganizing the columns.
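To see why cbind rather than rbind is the right starting point, here is a minimal sketch on a cut-down stand-in for the question's list. The two-element `data` list and the `dates` vector below are assumptions made for illustration, not the original data.

```r
# Hypothetical stand-in for the question's list: two 3 x 1 matrices sharing row names
dates <- c("2016-01-01", "2016-01-04", "2016-01-05")
data <- list(
  open  = matrix(c(NA, 135.60, 136.76), ncol = 1, dimnames = list(dates, "IBM")),
  close = matrix(c(NA, 135.95, 135.85), ncol = 1, dimnames = list(dates, "IBM"))
)

# rbind appends rows, so every matrix ends up in the same single column
stacked <- do.call(rbind, data)
# cbind appends columns, giving one column per list element
side <- do.call(cbind, data)

dim(stacked)  # 6 rows, 1 column
dim(side)     # 3 rows, 2 columns

# Both columns inherit the matrix column name "IBM", so rename them from the list
colnames(side) <- names(data)
```

The renaming step plays the same role as the setNames call in the answer: without it the columns are all called "IBM".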
Answer 1: (score: 0)
You can merge this list with a for loop. I use as.numeric to strip the row and column names from each matrix:
list.to.df <- structure(list(open = structure(c(NA, 135.600006, 136.759995), .Dim = c(3L,
1L), .Dimnames = list(structure(c("2016-01-01", "2016-01-04",
"2016-01-05"), .Dim = c(3L, 1L)), "IBM")), high = structure(c(NA,
135.970001, 136.889999), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01",
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), low = structure(c(NA,
134.240005, 134.850006), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01",
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), close = structure(c(NA,
135.949997, 135.850006), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01",
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), volume = structure(c(NA,
5229400L, 3924800L), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01",
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), adj.close = structure(c(NA,
130.959683, 130.863362), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01",
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM"))), .Names = c("open",
"high", "low", "close", "volume", "adj.close"))
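# Keep the list's element names to use as column names
col.names <- names(list.to.df)
# Start with the dates, taken from the row names of the first matrix
df <- data.frame(Date = as.Date(row.names(list.to.df[[1]])))
# Append each matrix as a plain numeric column
for (i in seq_along(list.to.df)) {
  df[, i + 1] <- as.numeric(list.to.df[[i]])
  names(df)[i + 1] <- col.names[i]
}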
Date open high low close volume adj.close
1 2016-01-01 NA NA NA NA NA NA
2 2016-01-04 135.60 135.97 134.24 135.95 5229400 130.9597
3 2016-01-05 136.76 136.89 134.85 135.85 3924800 130.8634
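The same idea can probably be written without an explicit loop. The sketch below is an alternative, not the answerer's code: it uses lapply with as.numeric to flatten every matrix at once and passes the resulting named list straight to data.frame. The small `prices` list and `dates` vector are hypothetical stand-ins for the question's data.

```r
# Hypothetical stand-in for the question's list: two 3 x 1 matrices sharing row names
dates <- c("2016-01-01", "2016-01-04", "2016-01-05")
prices <- list(
  open  = matrix(c(NA, 135.60, 136.76), ncol = 1, dimnames = list(dates, "IBM")),
  close = matrix(c(NA, 135.95, 135.85), ncol = 1, dimnames = list(dates, "IBM"))
)

# as.numeric drops dim and dimnames, leaving a plain vector per list element;
# data.frame then uses the list names as column names
df <- data.frame(
  Date = as.Date(rownames(prices[[1]])),
  lapply(prices, as.numeric)
)

str(df)
```

Because the dates are converted with as.Date, the Date column can be used directly for sorting or date arithmetic.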