Here is a list that you can run in your console (please tell me if it is too long for an example and I will trim it):
my_list = list(structure(list(PX_LAST = c(0.398, 0.457, 0.4, 0.159, 0.126,
0.108, 0.26, 0.239, 0.222, 0.191, 0.184)), .Names = "PX_LAST", row.names = c("2014-04-28 00:00:00",
"2014-04-29 00:00:00", "2014-04-30 00:00:00", "2014-05-02 00:00:00",
"2014-05-05 00:00:00", "2014-05-06 00:00:00", "2014-05-07 00:00:00",
"2014-05-08 00:00:00", "2014-05-09 00:00:00", "2014-05-12 00:00:00",
"2014-05-13 00:00:00"), class = "data.frame"), structure(list(
PX_LAST = c(1.731, 1.706, 1.7095, 1.69, 1.713, 1.711, 1.724,
1.699, 1.702, 1.705, 1.649, 1.611)), .Names = "PX_LAST", row.names = c("2014-04-29 00:00:00",
"2014-04-30 00:00:00", "2014-05-01 00:00:00", "2014-05-02 00:00:00",
"2014-05-05 00:00:00", "2014-05-06 00:00:00", "2014-05-07 00:00:00",
"2014-05-08 00:00:00", "2014-05-09 00:00:00", "2014-05-12 00:00:00",
"2014-05-13 00:00:00", "2014-05-14 00:00:00"), class = "data.frame"),
structure(list(PX_LAST = c(0.481, 0.456, 0.448, 0.439, 0.436,
0.448, 0.458, 0.466, 0.432, 0.437, 0.441, 0.417, 0.4035)), .Names = "PX_LAST", row.names = c("2014-04-28 00:00:00",
"2014-04-29 00:00:00", "2014-04-30 00:00:00", "2014-05-01 00:00:00",
"2014-05-02 00:00:00", "2014-05-05 00:00:00", "2014-05-06 00:00:00",
"2014-05-07 00:00:00", "2014-05-08 00:00:00", "2014-05-09 00:00:00",
"2014-05-12 00:00:00", "2014-05-13 00:00:00", "2014-05-14 00:00:00"
), class = "data.frame"), structure(list(PX_LAST = c(1.65,
1.65, 1.64, 1.65, 1.662, 1.6595, 1.665, 1.6595, 1.6625, 1.652,
1.645, 1.6245, 1.627, 1.633)), .Names = "PX_LAST", row.names = c("2014-04-25 00:00:00",
"2014-04-28 00:00:00", "2014-04-29 00:00:00", "2014-04-30 00:00:00",
"2014-05-01 00:00:00", "2014-05-02 00:00:00", "2014-05-05 00:00:00",
"2014-05-06 00:00:00", "2014-05-07 00:00:00", "2014-05-08 00:00:00",
"2014-05-09 00:00:00", "2014-05-12 00:00:00", "2014-05-13 00:00:00",
"2014-05-14 00:00:00"), class = "data.frame"))
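My question is: how can I use do.call() on this list to merge all the data by date?
Note that I could not get this to work with merge or cbind, both of which return errors: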
> do.call(what = merge, args = my_list)
Error in fix.by(by.x, x) :
'by' must specify column(s) as numbers, names or logical
> do.call(what = cbind, args = my_list)
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 11, 12, 13, 14
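I would like to get a single data matrix (with missing or unmatched data replaced by NA), equivalent to what merge() would produce when applied across the elements of my_list.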
Answer 0 (score: 2)
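This would be a bit easier if you were not merging by row names, but you can do it with the Reduce function, which applies a two-argument function successively across the elements of a list (in this case, data.frames). Try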
Reduce(function(x,y) {
dd<-merge(x,y,by=0); rownames(dd)<-dd$Row.names; dd[-1]
}, my_list)
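This merges only the rows whose row names match in every data frame. If you also want to keep unmatched rows (filled with NA), add all=TRUE to the merge() call; any other merge() argument can be customized in the same way.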
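As background, do.call(merge, my_list) fails because merge() joins exactly two data frames: the list elements are passed positionally, so the third and fourth data frames end up in the by and by.x arguments. Reduce avoids this by merging pairwise. The following is a minimal sketch of the all=TRUE variant, assuming a small made-up list (toy_list and merge_by_rowname are hypothetical names, and the values are not from the question):

```r
# Toy data with made-up values, keyed by row names like my_list
toy_list <- list(
  data.frame(PX_LAST = c(1, 2),     row.names = c("2014-04-28", "2014-04-29")),
  data.frame(PX_LAST = c(10, 20),   row.names = c("2014-04-29", "2014-04-30")),
  data.frame(PX_LAST = c(100, 200), row.names = c("2014-04-28", "2014-04-30"))
)

# Give each column a unique name so merge() has nothing to disambiguate
toy_list <- Map(function(x, n) { names(x) <- n; x },
                toy_list,
                paste("PX_LAST", seq_along(toy_list), sep = "_"))

# Pairwise full outer join on row names (by = 0), keeping unmatched rows
merge_by_rowname <- function(x, y) {
  dd <- merge(x, y, by = 0, all = TRUE)
  rownames(dd) <- dd$Row.names  # restore row names for the next merge
  dd[-1]                        # drop the helper Row.names column
}

merged_all <- Reduce(merge_by_rowname, toy_list)
merged_all
```

Each date appears once in merged_all, with NA wherever a series has no value for that date.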
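You will get a warning about column names because every column has the same name, so once more than two data frames have been merged, merge() can no longer generate unique names for them. You can rename the columns first with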
my_new_list <- Map(
function(x,n) {
names(x)<-n; x
},
my_list,
paste("PX_LAST",1:length(my_list), sep="_")
)
Then
Reduce(function(x,y) {
dd<-merge(x,y,by=0); rownames(dd)<-dd$Row.names; dd[-1]
}, my_new_list)
runs without the warning.
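If merging by row names is not a requirement, one alternative is to move the row names into a real date column and merge on that. This is only a sketch under that assumption; toy_list, with_date and the values are hypothetical, and the timestamps are truncated to their date part:

```r
# Toy data with made-up values, row names formatted like those in my_list
toy_list <- list(
  data.frame(PX_LAST = c(1, 2),
             row.names = c("2014-04-28 00:00:00", "2014-04-29 00:00:00")),
  data.frame(PX_LAST = c(10, 20),
             row.names = c("2014-04-29 00:00:00", "2014-04-30 00:00:00"))
)

# Rename the value column and turn the row names into a Date column
with_date <- Map(function(x, n) {
  names(x) <- n
  x$date <- as.Date(substr(rownames(x), 1, 10))  # keep "YYYY-MM-DD" only
  rownames(x) <- NULL
  x
}, toy_list, paste("PX_LAST", seq_along(toy_list), sep = "_"))

# Full outer join on the date column; no row-name bookkeeping needed
merged <- Reduce(function(x, y) merge(x, y, by = "date", all = TRUE),
                 with_date)
merged
```

With a proper key column the function passed to Reduce becomes a plain merge() call, and the result is sorted by date.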
Answer 1 (score: 1)
Here is a solution using data.table and reshape2:
# Load libraries
library(data.table)
library(reshape2)
# Setup new list object
my_list.2 <- vector(length(my_list), mode="list")
# Add time stamps as variable and add ID variable
for(i in seq_along(my_list)){
my_list.2[[i]] <- cbind(time=rownames(my_list[[i]]), my_list[[i]], id=rep(paste0("list_",i), times=nrow(my_list[[i]])))
}
# Collapse all lists in one data table
d.temp <- rbindlist(my_list.2)
# Transform the data
d.final <- dcast(time~id, value.var="PX_LAST", data=d.temp)
# > d.final
# time list_1 list_2 list_3 list_4
# 1 2014-04-28 00:00:00 0.398 NA 0.4810 1.6500
# 2 2014-04-29 00:00:00 0.457 1.7310 0.4560 1.6400
# 3 2014-04-30 00:00:00 0.400 1.7060 0.4480 1.6500
# 4 2014-05-02 00:00:00 0.159 1.6900 0.4360 1.6595
# 5 2014-05-05 00:00:00 0.126 1.7130 0.4480 1.6650
# 6 2014-05-06 00:00:00 0.108 1.7110 0.4580 1.6595
# 7 2014-05-07 00:00:00 0.260 1.7240 0.4660 1.6625
# 8 2014-05-08 00:00:00 0.239 1.6990 0.4320 1.6520
# 9 2014-05-09 00:00:00 0.222 1.7020 0.4370 1.6450
# 10 2014-05-12 00:00:00 0.191 1.7050 0.4410 1.6245
# 11 2014-05-13 00:00:00 0.184 1.6490 0.4170 1.6270
# 12 2014-05-01 00:00:00 NA 1.7095 0.4390 1.6620
# 13 2014-05-14 00:00:00 NA 1.6110 0.4035 1.6330
# 14 2014-04-25 00:00:00 NA NA NA 1.6500
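Note that the rows in the output above are not in chronological order (rows 12 to 14 belong earlier). If that matters, the result can be sorted on the parsed timestamp. A minimal sketch in base R, using a three-row excerpt of the output above (d_wide and d_sorted are hypothetical names, and the UTC time zone is an assumption):

```r
# Three rows excerpted from the output above, deliberately out of order
d_wide <- data.frame(
  time   = c("2014-04-28 00:00:00", "2014-05-01 00:00:00", "2014-04-25 00:00:00"),
  list_1 = c(0.398, NA, NA),
  list_4 = c(1.65, 1.662, 1.65)
)

# as.character() guards against 'time' being a factor, then sort by timestamp
ord <- order(as.POSIXct(as.character(d_wide$time), tz = "UTC"))
d_sorted <- d_wide[ord, ]
rownames(d_sorted) <- NULL
d_sorted
```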