Question

I have this list:

dput(data)

structure(list(open = structure(c(NA, 135.600006, 136.759995), .Dim = c(3L, 
1L), .Dimnames = list(structure(c("2016-01-01", "2016-01-04", 
"2016-01-05"), .Dim = c(3L, 1L)), "IBM")), high = structure(c(NA, 
135.970001, 136.889999), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01", 
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), low = structure(c(NA, 
134.240005, 134.850006), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01", 
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), close = structure(c(NA, 
135.949997, 135.850006), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01", 
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), volume = structure(c(NA, 
5229400L, 3924800L), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01", 
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), adj.close = structure(c(NA, 
130.959683, 130.863362), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01", 
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM"))), .Names = c("open", 
"high", "low", "close", "volume", "adj.close"))

I am trying to convert this list into a data frame so that I can do further calculations.

I need the data frame to look like this:

Date Open High  Low Close  Volume
1985-01-02 3.18 3.18 3.08  3.08 1870906

I tried this:

do.call(rbind, data)

But I can't see the columns in the result. Any ideas?

Answer 1

I'll post my comment as an answer:

data2 <- setNames(do.call('cbind.data.frame', data), names(data))
data2$date <- row.names(data2)
row.names(data2) <- NULL
data2 <- cbind.data.frame(date = data2$date, data2[,-7])

        date   open   high    low  close  volume adj.close
1 2016-01-01     NA     NA     NA     NA      NA        NA
2 2016-01-04 135.60 135.97 134.24 135.95 5229400  130.9597
3 2016-01-05 136.76 136.89 134.85 135.85 3924800  130.8634

Basically, we use cbind.data.frame instead of rbind to get close to what we want. From there it is just a matter of reorganizing: the row names become a date column, and that column is moved to the front.
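To see why rbind hides the columns, it can help to compare the dimensions of the two results. The following is only a minimal sketch on a rebuilt two-element list, not the original data: `make_col` is a hypothetical helper introduced here, and the values are rounded from the question.

```r
# Hypothetical helper: builds a 3 x 1 matrix shaped like those in the question
dates <- c("2016-01-01", "2016-01-04", "2016-01-05")
make_col <- function(x) matrix(x, ncol = 1, dimnames = list(dates, "IBM"))

data <- list(
  open  = make_col(c(NA, 135.60, 136.76)),
  close = make_col(c(NA, 135.95, 135.85))
)

stacked <- do.call(rbind, data)  # matrices stacked on top of each other
side    <- do.call(cbind, data)  # matrices placed next to each other

dim(stacked)  # one long column: every variable ends up in the same column
dim(side)     # one row per date, one column per list element
```

With rbind each variable is appended below the previous one, so there is only ever one column; with cbind each list element becomes its own column and the dates stay in the row names.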

Answer 2

You can combine this list with a for loop. I use as.numeric to strip the row and column names from each matrix:

list.to.df <- structure(list(open = structure(c(NA, 135.600006, 136.759995), .Dim = c(3L, 
1L), .Dimnames = list(structure(c("2016-01-01", "2016-01-04", 
"2016-01-05"), .Dim = c(3L, 1L)), "IBM")), high = structure(c(NA, 
135.970001, 136.889999), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01", 
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), low = structure(c(NA, 
134.240005, 134.850006), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01", 
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), close = structure(c(NA, 
135.949997, 135.850006), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01", 
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), volume = structure(c(NA, 
5229400L, 3924800L), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01", 
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM")), adj.close = structure(c(NA, 
130.959683, 130.863362), .Dim = c(3L, 1L), .Dimnames = list(structure(c("2016-01-01", 
"2016-01-04", "2016-01-05"), .Dim = c(3L, 1L)), "IBM"))), .Names = c("open", 
"high", "low", "close", "volume", "adj.close"))

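# Keep the element names to use as column names
col.names <- names(list.to.df)

# Start with the dates, taken from the row names of the first matrix
df <- data.frame(Date = as.Date(row.names(list.to.df[[1]])))

# Append each matrix as a plain numeric column
for (i in seq_along(list.to.df)) {
  df[, i + 1] <- as.numeric(list.to.df[[i]])
  names(df)[i + 1] <- col.names[i]
}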

        Date   open   high    low  close  volume adj.close
1 2016-01-01     NA     NA     NA     NA      NA        NA
2 2016-01-04 135.60 135.97 134.24 135.95 5229400  130.9597
3 2016-01-05 136.76 136.89 134.85 135.85 3924800  130.8634
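The same idea can probably be written without an explicit loop. This is a sketch under assumptions, not part of the original answer: it rebuilds a small two-element list with a hypothetical helper `make_col` (values rounded from the question) and passes the result of lapply straight to data.frame.

```r
# Hypothetical helper: builds a 3 x 1 matrix shaped like those in the question
dates <- c("2016-01-01", "2016-01-04", "2016-01-05")
make_col <- function(x) matrix(x, ncol = 1, dimnames = list(dates, "IBM"))

lst <- list(
  open   = make_col(c(NA, 135.60, 136.76)),
  volume = make_col(c(NA, 5229400, 3924800))
)

# as.numeric drops the dimnames; data.frame() then uses the list names as column names
df2 <- data.frame(
  Date = as.Date(rownames(lst[[1]])),
  lapply(lst, as.numeric)
)

str(df2)
```

This assumes every matrix has exactly one column and the same row order, which holds for the data shown in the question.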

Convert a list to a data frame in R
