I have the following R data frames:
DF1
a b c d
2 0.671 0.105 0.181 0.241
3 0.446 -0.243 0.051 1.577
5 0.624 0.075 -0.451 -0.212
and DF2
a b c d
2 3.672 7.204 -0.164 3.251
3 4.445 -0.242 0.025 1.627
5 2.621 0.375 -0.468 -4.762
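Both data frames have the same dimensions. I want to combine them by index position, so that the final result is 12 vectors (or 12 one-dimensional data frames), each named after the column and row index its values are drawn from.
For example, the result would be: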
a2(0.671,3.672)
b2(0.105,7.204)
...
d5(-0.212,-4.762)
Thanks!
Answer 0 (score: 2)
We can use base R:
lst <- Map(`c`, t(DF1), t(DF2))
names(lst) <- do.call(paste0, expand.grid(dimnames(t(DF1))))
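The two lines above rely on element ordering: Map walks both transposed matrices element by element in column-major order, and expand.grid varies the first dimension (the original column names) fastest, so the names line up with the values. The following is a minimal sketch that checks this on the question's data; the loop and the variable names col and row are illustrative additions, not part of the original answer, and the check assumes single-letter column names.

```r
# Rebuild the question's data frames (values copied from the question).
DF1 <- data.frame(a = c(0.671, 0.446, 0.624), b = c(0.105, -0.243, 0.075),
                  c = c(0.181, 0.051, -0.451), d = c(0.241, 1.577, -0.212),
                  row.names = c("2", "3", "5"))
DF2 <- data.frame(a = c(3.672, 4.445, 2.621), b = c(7.204, -0.242, 0.375),
                  c = c(-0.164, 0.025, -0.468), d = c(3.251, 1.627, -4.762),
                  row.names = c("2", "3", "5"))

# The answer's approach: pair up elements, then name them.
lst <- Map(c, t(DF1), t(DF2))
names(lst) <- do.call(paste0, expand.grid(dimnames(t(DF1))))

# Sanity check: every name must point back to the cell it was taken from.
for (nm in names(lst)) {
  col <- substr(nm, 1, 1)   # column label, e.g. "a" (assumes one-letter names)
  row <- substring(nm, 2)   # row label, e.g. "2"
  stopifnot(identical(lst[[nm]], c(DF1[row, col], DF2[row, col])))
}

names(lst)[1:5]  # "a2" "b2" "c2" "d2" "a3"
lst$d5           # -0.212 -4.762
```

If the column names could be longer than one character, the name would need a separator (for example paste(..., sep = "_")) to be split back reliably.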
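Answer 1 (score: 1)
Seeing that you plan to do do.call(cbind, ...) at the end, perhaps you should consider a different approach. You can easily create a function like the following: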
library(data.table)

combineTranspose <- function(...) {
  temp <- list(...)
  # Melt each input to long format, keeping its row names in column "rn"
  rbindlist(lapply(temp, function(x) {
    melt(as.data.table(x, keep.rownames = TRUE), "rn")
  }))[, dcast(.SD, rowid(variable, rn) ~ paste0(variable, rn),
              value.var = "value")]
}
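The function takes a variable number of data.frames as input. It converts them to data.tables, adds the rownames as a variable, melts each one to long format, rbinds them together, and then reshapes the data to wide format.
One advantage here is that neither the order of the columns and rows in the inputs, nor even whether the same columns and rows are present in every input, matters. Here is a simple example.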
set.seed(1)
df1 <- data.frame(a = runif(3), b = runif(3), c = runif(3),
                  d = runif(3), row.names = c(1, 2, 3))
df2 <- data.frame(a = runif(3), b = runif(3), c = runif(3),
                  d = runif(3), row.names = c(1, 3, 4))
df3 <- data.frame(a = runif(3), b = runif(3), c = runif(3),
                  d = runif(3), row.names = c(4, 2, 3))
combineTranspose(df1, df2, df3)
## variable a1 a2 a3 a4 b1 b2 b3
## 1: 1 0.2655087 0.3721239 0.57285336 0.7698414 0.9082078 0.2016819 0.8983897
## 2: 2 0.6870228 0.3861141 0.38410372 0.2672207 0.4976992 0.8696908 0.7176185
## 3: 3 NA NA 0.01339033 NA NA NA 0.3403490
## b4 c1 c2 c3 c4 d1 d2 d3
## 1: 0.9919061 0.9446753 0.6607978 0.6291140 0.9347052 0.06178627 0.2059746 0.1765568
## 2: 0.3823880 0.3800352 0.5995658 0.7774452 0.4820801 0.21214252 0.8273733 0.6516738
## 3: NA NA NA 0.4935413 NA NA NA 0.6684667
## d4
## 1: 0.1255551
## 2: 0.1862176
## 3: NA
Here is the function applied to your input data:
DF1 <- structure(list(a = c(0.671, 0.446, 0.624), b = c(0.105, -0.243, 0.075),
                      c = c(0.181, 0.051, -0.451), d = c(0.241, 1.577, -0.212)),
                 .Names = c("a", "b", "c", "d"), row.names = c("2", "3", "5"), class = "data.frame")
DF2 <- structure(list(a = c(3.672, 4.445, 2.621), b = c(7.204, -0.242, 0.375),
                      c = c(-0.164, 0.025, -0.468), d = c(3.251, 1.627, -4.762)),
                 .Names = c("a", "b", "c", "d"), row.names = c("2", "3", "5"), class = "data.frame")
combineTranspose(DF1, DF2)
## variable a2 a3 a5 b2 b3 b5 c2 c3 c5 d2 d3 d5
## 1: 1 0.671 0.446 0.624 0.105 -0.243 0.075 0.181 0.051 -0.451 0.241 1.577 -0.212
## 2: 2 3.672 4.445 2.621 7.204 -0.242 0.375 -0.164 0.025 -0.468 3.251 1.627 -4.762
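To make the reshaping steps easier to follow, here is a minimal sketch that runs them one at a time on a cut-down version of the data (two columns, two rows). It assumes the data.table package is installed; the helper column names src and cell are illustrative and do not appear in the function above, which computes the same things inline inside the dcast formula.

```r
library(data.table)

# Cut-down inputs: first two columns and rows of the question's data.
DF1 <- data.frame(a = c(0.671, 0.446), b = c(0.105, -0.243),
                  row.names = c("2", "3"))
DF2 <- data.frame(a = c(3.672, 4.445), b = c(7.204, -0.242),
                  row.names = c("2", "3"))

# Step 1: convert to data.table (row names go to column "rn") and melt to long.
to_long <- function(x) melt(as.data.table(x, keep.rownames = TRUE), id.vars = "rn")

# Step 2: stack the long tables; columns are rn, variable, value.
long <- rbindlist(lapply(list(DF1, DF2), to_long))

# Step 3: number repeated (variable, rn) pairs; 1 = first input, 2 = second.
long[, src := rowid(variable, rn)]

# Step 4: build the target column name, e.g. "a2", and cast to wide format.
long[, cell := paste0(variable, rn)]
wide <- dcast(long, src ~ cell, value.var = "value")

print(long)
print(wide)
```

Because the cells are matched by label rather than by position, an input that lacks a given row or column simply yields NA in the corresponding wide column.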
Answer 2 (score: 0)
Does this do what you want?
# sample data
df1 = read.table(text=" a b c d
2 0.671 0.105 0.181 0.241
3 0.446 -0.243 0.051 1.577
5 0.624 0.075 -0.451 -0.212", header = TRUE)
df2 = read.table(text=" a b c d
2 3.672 7.204 -0.164 3.251
3 4.445 -0.242 0.025 1.627
5 2.621 0.375 -0.468 -4.762", header = TRUE)
# reshaping the dataframe
library(reshape2)
library(dplyr)
# keep the original row labels (2, 3, 5) so they can be used in the names
df1$rowid = rownames(df1)
df2$rowid = rownames(df2)
df1 = melt(df1, id.vars = c("rowid"))
df2 = melt(df2, id.vars = c("rowid"))
df1 = df1 %>% full_join(df2, by = c('rowid', 'variable'))
Output:
rowid variable value.x value.y
1 2 a 0.671 3.672
2 3 a 0.446 4.445
3 5 a 0.624 2.621
4 2 b 0.105 7.204
5 3 b -0.243 -0.242
6 5 b 0.075 0.375
7 2 c 0.181 -0.164
8 3 c 0.051 0.025
9 5 c -0.451 -0.468
10 2 d 0.241 3.251
11 3 d 1.577 1.627
12 5 d -0.212 -4.762
Or, if you want a list of one-dimensional data frames:
y = split(df1[,c('value.x','value.y')],seq(nrow(df1)))
names(y) = paste0(df1$variable,df1$rowid)
Output:
$a2
value.x value.y
1 0.671 3.672
$a3
value.x value.y
2 0.446 4.445
$a5
value.x value.y
3 0.624 2.621
$b2
value.x value.y
4 0.105 7.204
$b3
value.x value.y
5 -0.243 -0.242
$b5
value.x value.y
6 0.075 0.375
$c2
value.x value.y
7 0.181 -0.164
$c3
value.x value.y
8 0.051 0.025
$c5
value.x value.y
9 -0.451 -0.468
$d2
value.x value.y
10 0.241 3.251
$d3
value.x value.y
11 1.577 1.627
$d5
value.x value.y
12 -0.212 -4.762
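If plain numeric vectors are preferred over one-row data frames, as the question's a2(0.671,3.672) notation suggests, each list element can be flattened. The sketch below is a base R illustration under that assumption; the data frame joined is a hand-built stand-in for the first three rows of the joined table shown above, and vecs is an illustrative name.

```r
# Hand-built stand-in for the first three rows of the joined table.
joined <- data.frame(
  rowid    = c("2", "3", "5"),
  variable = c("a", "a", "a"),
  value.x  = c(0.671, 0.446, 0.624),
  value.y  = c(3.672, 4.445, 2.621)
)

# Same split-and-name step as in the answer: a list of one-row data frames.
y <- split(joined[, c("value.x", "value.y")], seq(nrow(joined)))
names(y) <- paste0(joined$variable, joined$rowid)

# Flatten each one-row data frame into an unnamed numeric vector.
vecs <- lapply(y, function(d) unname(unlist(d)))

vecs$a2  # 0.671 3.672
```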
Hope this helps!