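I have n input dataframes, each with a TimeStamp column + k numeric columns.
I want to transform them into k output dataframes, each with a TimeStamp column + n numeric columns, so that numeric column i of output dataframe j contains the values from numeric column j of input dataframe i (column indices exclude the TimeStamp column, which is the first column), and missing TimeStamps are filled with NA.
The first column of each dataframe is always the TimeStamp column. The input dataframes have different numbers of rows, with possibly different TimeStamps (though the TimeStamps may overlap).
For example, each of the dataframes d1, d2 (n=2) has the following structure (an example dataframe d1 is shown below with k=4; k can be arbitrary, but it is the same for every dataframe), and each is stored in a separate csv file:
d1 <- structure(list(TimeStamp = structure(1:6, .Label = c("2016-12-20 10:17:20", "2016-12-20 10:19:20", "2016-12-20 10:19:40", "2016-12-20 10:20:00", "2016-12-20 10:20:20", "2016-12-20 10:20:40", "2016-12-20 10:21:00",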
"2016-12-20 10:21:20", "2016-12-20 10:21:40", "2016-12-20 10:22:00",
"2016-12-20 10:22:20", "2016-12-20 10:22:40", "2016-12-20 10:23:00",
"2016-12-20 10:23:20", "2016-12-20 10:23:40", "2016-12-20 10:24:00",
"2016-12-20 10:24:20", "2016-12-20 10:24:40", "2016-12-20 10:25:00",
"2016-12-20 10:25:20", "2016-12-20 10:25:40", "2016-12-20 10:26:00",
"2016-12-20 10:26:20", "2016-12-20 10:26:40", "2016-12-20 10:27:00",
"2016-12-20 10:27:20", "2016-12-20 10:27:40", "2016-12-20 10:28:00",
"2016-12-20 10:28:20", "2016-12-20 10:28:40", "2016-12-20 10:29:00",
"2016-12-20 10:29:20", "2016-12-20 10:29:40", "2016-12-20 10:30:00",
"2016-12-20 10:30:20", "2016-12-20 10:30:40", "2016-12-20 10:31:00",
"2016-12-20 10:31:20", "2016-12-20 10:31:40", "2016-12-20 10:32:00",
"2016-12-20 10:32:20", "2016-12-20 10:32:40", "2016-12-20 10:33:00",
"2016-12-20 10:33:20", "2016-12-20 10:33:40", "2016-12-20 10:34:00",
"2016-12-20 10:34:20", "2016-12-20 10:34:40", "2016-12-20 10:35:00",
"2016-12-20 10:35:20", "2016-12-20 10:35:40", "2016-12-20 10:36:00",
"2016-12-20 10:37:00", "2016-12-20 10:37:20", "2016-12-20 10:37:40",
"2016-12-20 10:38:00", "2016-12-20 10:38:20", "2016-12-20 10:40:40",
"2016-12-20 10:41:20", "2016-12-20 10:41:40", "2016-12-20 10:44:20",
"2016-12-20 10:44:40", "2016-12-20 10:46:00", "2016-12-20 10:49:40",
"2016-12-20 10:50:00", "2016-12-20 10:50:20", "2016-12-20 10:55:00",
"2016-12-20 10:56:00", "2016-12-20 10:57:20", "2016-12-20 10:59:20",
"2016-12-20 10:59:40", "2016-12-20 11:00:20", "2016-12-20 11:01:20",
"2016-12-20 11:05:40", "2016-12-20 11:06:00", "2016-12-20 11:07:20",
"2016-12-20 11:08:20", "2016-12-20 11:08:40", "2016-12-20 11:11:40",
"2016-12-20 11:12:00", "2016-12-20 11:14:20", "2016-12-20 11:14:40",
"2016-12-20 11:15:00", "2016-12-20 11:15:20", "2016-12-20 11:15:40",
"2016-12-20 11:16:00", "2016-12-20 11:16:20", "2016-12-20 11:18:20",
"2016-12-20 11:18:40", "2016-12-20 11:19:00", "2016-12-20 11:19:20",
"2016-12-20 11:19:40", "2016-12-20 11:21:20", "2016-12-20 11:21:40",
"2016-12-20 11:22:20", "2016-12-20 11:22:40", "2016-12-20 11:23:00",
"2016-12-20 11:23:20", "2016-12-20 11:25:00", "2016-12-20 11:25:20",
"2016-12-20 11:26:00", "2016-12-20 11:26:40", "2016-12-20 11:27:00",
"2016-12-20 11:27:20", "2016-12-20 11:27:40", "2016-12-20 11:28:00",
"2016-12-20 11:28:20", "2016-12-20 11:28:40", "2016-12-20 11:34:40",
"2016-12-20 11:36:20", "2016-12-20 11:36:40", "2016-12-20 11:41:00",
"2016-12-20 11:41:20", "2016-12-20 11:42:20", "2016-12-20 11:42:40",
"2016-12-20 11:46:40", "2016-12-20 11:47:00", "2016-12-20 11:47:20",
"2016-12-20 11:47:40", "2016-12-20 11:48:00", "2016-12-20 11:48:20",
"2016-12-20 11:48:40", "2016-12-20 11:54:00", "2016-12-20 11:54:20",
"2016-12-20 11:57:40", "2016-12-20 12:00:00", "2016-12-20 12:00:40",
"2016-12-20 12:01:00", "2016-12-20 12:01:20", "2016-12-20 12:01:40",
"2016-12-20 12:02:20", "2016-12-20 12:02:40", "2016-12-20 12:03:00",
"2016-12-20 12:03:20", "2016-12-20 12:03:40", "2016-12-20 12:07:00",
"2016-12-20 12:07:20", "2016-12-20 12:07:40", "2016-12-20 12:08:00",
"2016-12-20 12:08:20", "2016-12-20 12:10:20", "2016-12-20 12:10:40"
), class = "factor"), b1 = c(-76L, 0L, 0L, -76L, -80L, -81L),
b2 = c(0L, -74L, -79L, -73L, -79L, -77L), b3 = c(0L, 0L,
-88L, -88L, -91L, 0L), b4 = c(0L, 0L, 0L, -78L, -80L, -78L
)), .Names = c("TimeStamp", "b1", "b2", "b3", "b4"), row.names = c(NA,
6L), class = "data.frame")
head(d1)
# TimeStamp b1 b2 b3 b4
#1 2016-12-20 10:17:20 -76 0 0 0
#2 2016-12-20 10:19:20 0 -74 0 0
#3 2016-12-20 10:19:40 0 -79 -88 0
#4 2016-12-20 10:20:00 -76 -73 -88 -78
#5 2016-12-20 10:20:20 -80 -79 -91 -80
#6 2016-12-20 10:20:40 -81 -77 0 -78
d2 <- structure(list(TimeStamp = structure(137:142, .Label = c("2016-12-20 10:17:20",
"2016-12-20 10:19:20", "2016-12-20 10:19:40", "2016-12-20 10:20:00",
"2016-12-20 10:20:20", "2016-12-20 10:20:40", "2016-12-20 10:21:00",
"2016-12-20 10:21:20", "2016-12-20 10:21:40", "2016-12-20 10:22:00",
"2016-12-20 10:22:20", "2016-12-20 10:22:40", "2016-12-20 10:23:00",
"2016-12-20 10:23:20", "2016-12-20 10:23:40", "2016-12-20 10:24:00",
"2016-12-20 10:24:20", "2016-12-20 10:24:40", "2016-12-20 10:25:00",
"2016-12-20 10:25:20", "2016-12-20 10:25:40", "2016-12-20 10:26:00",
"2016-12-20 10:26:20", "2016-12-20 10:26:40", "2016-12-20 10:27:00",
"2016-12-20 10:27:20", "2016-12-20 10:27:40", "2016-12-20 10:28:00",
"2016-12-20 10:28:20", "2016-12-20 10:28:40", "2016-12-20 10:29:00",
"2016-12-20 10:29:20", "2016-12-20 10:29:40", "2016-12-20 10:30:00",
"2016-12-20 10:30:20", "2016-12-20 10:30:40", "2016-12-20 10:31:00",
"2016-12-20 10:31:20", "2016-12-20 10:31:40", "2016-12-20 10:32:00",
"2016-12-20 10:32:20", "2016-12-20 10:32:40", "2016-12-20 10:33:00",
"2016-12-20 10:33:20", "2016-12-20 10:33:40", "2016-12-20 10:34:00",
"2016-12-20 10:34:20", "2016-12-20 10:34:40", "2016-12-20 10:35:00",
"2016-12-20 10:35:20", "2016-12-20 10:35:40", "2016-12-20 10:36:00",
"2016-12-20 10:37:00", "2016-12-20 10:37:20", "2016-12-20 10:37:40",
"2016-12-20 10:38:00", "2016-12-20 10:38:20", "2016-12-20 10:40:40",
"2016-12-20 10:41:20", "2016-12-20 10:41:40", "2016-12-20 10:44:20",
"2016-12-20 10:44:40", "2016-12-20 10:46:00", "2016-12-20 10:49:40",
"2016-12-20 10:50:00", "2016-12-20 10:50:20", "2016-12-20 10:55:00",
"2016-12-20 10:56:00", "2016-12-20 10:57:20", "2016-12-20 10:59:20",
"2016-12-20 10:59:40", "2016-12-20 11:00:20", "2016-12-20 11:01:20",
"2016-12-20 11:05:40", "2016-12-20 11:06:00", "2016-12-20 11:07:20",
"2016-12-20 11:08:20", "2016-12-20 11:08:40", "2016-12-20 11:11:40",
"2016-12-20 11:12:00", "2016-12-20 11:14:20", "2016-12-20 11:14:40",
"2016-12-20 11:15:00", "2016-12-20 11:15:20", "2016-12-20 11:15:40",
"2016-12-20 11:16:00", "2016-12-20 11:16:20", "2016-12-20 11:18:20",
"2016-12-20 11:18:40", "2016-12-20 11:19:00", "2016-12-20 11:19:20",
"2016-12-20 11:19:40", "2016-12-20 11:21:20", "2016-12-20 11:21:40",
"2016-12-20 11:22:20", "2016-12-20 11:22:40", "2016-12-20 11:23:00",
"2016-12-20 11:23:20", "2016-12-20 11:25:00", "2016-12-20 11:25:20",
"2016-12-20 11:26:00", "2016-12-20 11:26:40", "2016-12-20 11:27:00",
"2016-12-20 11:27:20", "2016-12-20 11:27:40", "2016-12-20 11:28:00",
"2016-12-20 11:28:20", "2016-12-20 11:28:40", "2016-12-20 11:34:40",
"2016-12-20 11:36:20", "2016-12-20 11:36:40", "2016-12-20 11:41:00",
"2016-12-20 11:41:20", "2016-12-20 11:42:20", "2016-12-20 11:42:40",
"2016-12-20 11:46:40", "2016-12-20 11:47:00", "2016-12-20 11:47:20",
"2016-12-20 11:47:40", "2016-12-20 11:48:00", "2016-12-20 11:48:20",
"2016-12-20 11:48:40", "2016-12-20 11:54:00", "2016-12-20 11:54:20",
"2016-12-20 11:57:40", "2016-12-20 12:00:00", "2016-12-20 12:00:40",
"2016-12-20 12:01:00", "2016-12-20 12:01:20", "2016-12-20 12:01:40",
"2016-12-20 12:02:20", "2016-12-20 12:02:40", "2016-12-20 12:03:00",
"2016-12-20 12:03:20", "2016-12-20 12:03:40", "2016-12-20 12:07:00",
"2016-12-20 12:07:20", "2016-12-20 12:07:40", "2016-12-20 12:08:00",
"2016-12-20 12:08:20", "2016-12-20 12:10:20", "2016-12-20 12:10:40"
), class = "factor"), b1 = c(-76L, 0L, 0L, 0L, -82L, -74L), b2 = c(-87L,
-76L, 0L, 0L, 0L, -69L), b3 = c(0L, 0L, -84L, -84L, 0L, -85L),
b4 = c(-75L, 0L, 0L, 0L, 0L, 0L)), .Names = c("TimeStamp",
"b1", "b2", "b3", "b4"), row.names = c(NA, 6L), class = "data.frame")
head(d2)
# TimeStamp b1 b2 b3 b4
# 1 2016-12-20 12:07:20 -76 -87 0 -75
# 2 2016-12-20 12:07:40 0 -76 0 0
# 3 2016-12-20 12:08:00 0 0 -84 0
# 4 2016-12-20 12:08:20 0 0 -84 0
# 5 2016-12-20 12:10:20 -82 0 0 0
# 6 2016-12-20 12:10:40 -74 -69 -85 0
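Now I want k dataframes, each with a TimeStamp column + n numeric columns (to be saved as separate csv files). For example, from the input dataframes d1, d2 above I want to obtain the output dataframes b1, b2, b3, b4 (two of them are shown below).
The timestamps from different dataframes are disjoint in the given example, but in general they will overlap; in that case no NA fill is needed (since the numeric values will be present).
What is the simplest, most efficient and most generic way to do this?
b1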
# TimeStamp d1 d2
#2016-12-20 10:17:20 -76 NA
#2016-12-20 10:19:20 0 NA
#2016-12-20 10:19:40 0 NA
#2016-12-20 10:20:00 -76 NA
#2016-12-20 10:20:20 -80 NA
#2016-12-20 10:20:40 -81 NA
#2016-12-20 12:07:20 NA -76
#2016-12-20 12:07:40 NA 0
#2016-12-20 12:08:00 NA 0
#2016-12-20 12:08:20 NA 0
#2016-12-20 12:10:20 NA -82
#2016-12-20 12:10:40 NA -74
b2
# TimeStamp d1 d2
#2016-12-20 10:17:20 0 NA
#2016-12-20 10:19:20 -74 NA
#2016-12-20 10:19:40 -79 NA
#2016-12-20 10:20:00 -73 NA
#2016-12-20 10:20:20 -79 NA
#2016-12-20 10:20:40 -77 NA
#2016-12-20 12:07:20 NA -87
#2016-12-20 12:07:40 NA -76
#2016-12-20 12:08:00 NA 0
#2016-12-20 12:08:20 NA 0
#2016-12-20 12:10:20 NA 0
#2016-12-20 12:10:40 NA -69
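Ideally the solution uses base R / dplyr / tidyr / data.table, preferably without loops. The constants n and k, as well as the dataframes, can be arbitrarily large.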
Answer 0: (score: 1)
Maybe you can try this:
# read d1 data from PATH1
d1_df <- read.table("PATH1", header = TRUE, sep = "\t", stringsAsFactors = FALSE)
# store d1 column names (excluding TimeStamp)
d1_colname <- colnames(d1_df)[-1]
# read d2 data from PATH2
d2_df <- read.table("PATH2", header = TRUE, sep = "\t", stringsAsFactors = FALSE)
# store d2 column names (excluding TimeStamp)
d2_colname <- colnames(d2_df)[-1]
# concatenate the timestamps of the two data frames
# (note: this assumes the timestamps of d1 and d2 do not overlap)
TimeStamp <- c(d1_df[, 1], d2_df[, 1])
# pair up the column names: one column of this matrix per output data frame
merge_colname <- rbind(d1_colname, d2_colname)
# build one output data frame, padding the other input's rows with NA
merge_df <- function(vec_colname){
  d1 <- c(d1_df[, vec_colname[1]], rep(NA, nrow(d2_df)))
  d2 <- c(rep(NA, nrow(d1_df)), d2_df[, vec_colname[2]])
  data.frame(TimeStamp, d1, d2)
}
# get the result, which is a list of data frames
res_list <- apply(merge_colname, 2, merge_df)
# create data frames b1, b2, ... from the result
for(i in seq_along(res_list)){
  assign(paste0("b", i), res_list[[i]])
}
Result:
> b1
TimeStamp d1 d2
1 2016-12-20 10:17:20 -76 NA
2 2016-12-20 10:19:20 0 NA
3 2016-12-20 10:19:40 0 NA
4 2016-12-20 10:20:00 -76 NA
5 2016-12-20 10:20:20 -80 NA
6 2016-12-20 10:20:40 -81 NA
7 2016-12-20 12:07:20 NA -76
8 2016-12-20 12:07:40 NA 0
9 2016-12-20 12:08:00 NA 0
10 2016-12-20 12:08:20 NA 0
11 2016-12-20 12:10:20 NA -82
12 2016-12-20 12:10:40 NA -74
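The code above handles exactly two inputs with disjoint timestamps. For n inputs whose timestamps may overlap, one possible generalization is a full outer join on TimeStamp, sketched below in base R. The helper name pivot_frames and the small inputs are made up for illustration and are not part of the original answer. The sketch assumes that all inputs share the same numeric column names and that each TimeStamp appears at most once per input.

```r
# Hypothetical helper (not from the original answer): turn a named list of
# input data frames into one output data frame per numeric column.
pivot_frames <- function(dfs) {
  # numeric column names, assumed identical in every input (e.g. b1, b2, ...)
  value_cols <- names(dfs[[1]])[-1]
  out <- lapply(value_cols, function(col) {
    # keep TimeStamp + one value column, renamed after the input data frame
    pieces <- Map(function(df, nm) {
      piece <- data.frame(TimeStamp = as.character(df$TimeStamp),
                          value = df[[col]], stringsAsFactors = FALSE)
      names(piece)[2] <- nm
      piece
    }, dfs, names(dfs))
    # full outer join on TimeStamp: timestamps missing from an input become NA
    merged <- Reduce(function(x, y) merge(x, y, by = "TimeStamp", all = TRUE),
                     pieces)
    merged[order(merged$TimeStamp), , drop = FALSE]
  })
  names(out) <- value_cols
  out
}

# Small made-up inputs that share one timestamp
d1 <- data.frame(TimeStamp = c("2016-12-20 10:17:20", "2016-12-20 10:19:20"),
                 b1 = c(-76L, 0L), b2 = c(0L, -74L))
d2 <- data.frame(TimeStamp = c("2016-12-20 10:19:20", "2016-12-20 12:07:20"),
                 b1 = c(-80L, -76L), b2 = c(-79L, -87L))

res <- pivot_frames(list(d1 = d1, d2 = d2))
res$b1
```

Here res is a list with one data frame per numeric column, so no variables need to be created by name. Each element could then be saved with write.csv(res[[nm]], paste0(nm, ".csv"), row.names = FALSE) for every nm in names(res). If an input contained duplicate timestamps, merge would return every combination of the matching rows, so duplicates would need to be resolved first.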