Consider the following table:
X =
col1 col2 col3
row1 "A" "A" "1.0"
row2 "A" "B" "0.9"
row3 "A" "C" "0.4"
row4 "B" "A" "0.9"
row5 "B" "B" "1.0"
row6 "B" "C" "0.2"
row7 "C" "A" "0.4"
row8 "C" "B" "0.2"
row9 "C" "C" "1.0"
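where col3 is a measure of correlation between the pair of entities in col1 and col2.
How can I construct a matrix whose column names come from col1, whose row names come from col2, and whose cells are filled with the values from col3?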
Answer 0 (score: 3)
df <- read.table(textConnection('col1 col2 col3
row1 "A" "A" "1.0"
row2 "A" "B" "0.9"
row3 "A" "C" "0.4"
row4 "B" "A" "0.9"
row5 "B" "B" "1.0"
row6 "B" "C" "0.2"
row7 "C" "A" "0.4"
row8 "C" "B" "0.2"
row9 "C" "C" "1.0"'), header=TRUE)
## fetch row/column indices
rows <- match(df$col1, LETTERS)
cols <- match(df$col2, LETTERS)
## create matrix
m <- matrix(0, nrow=max(rows), ncol=max(cols))
## fill matrix
m[cbind(rows, cols)] <- df$col3
m
# [,1] [,2] [,3]
#[1,] 1.0 0.9 0.4
#[2,] 0.9 1.0 0.2
#[3,] 0.4 0.2 1.0
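The approach above relies on the labels being capital letters (it looks them up in LETTERS) and returns a matrix without row or column names. A minimal sketch of a variant that works for arbitrary labels and keeps the names, assuming each (col1, col2) pair appears at most once; the helper names row_keys and col_keys are only illustrative:

```r
# Rebuild the example data directly (same values as the table in the question)
df <- data.frame(
  col1 = rep(c("A", "B", "C"), each = 3),
  col2 = rep(c("A", "B", "C"), times = 3),
  col3 = c(1.0, 0.9, 0.4, 0.9, 1.0, 0.2, 0.4, 0.2, 1.0),
  stringsAsFactors = FALSE
)

# Row names come from col2, column names from col1, as the question asks
row_keys <- sort(unique(df$col2))
col_keys <- sort(unique(df$col1))

# Start from NA so that pairs missing from the table stay visible
m <- matrix(NA_real_,
            nrow = length(row_keys), ncol = length(col_keys),
            dimnames = list(row_keys, col_keys))

# A two-column character matrix indexes cells by (row name, column name)
m[cbind(df$col2, df$col1)] <- df$col3
m
```

Cells can then be looked up by name, for example m["B", "A"].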
Answer 1 (score: 3)
We need some data to work with, so I'll make some up.
# Make fake data
x <- c('A','B','C')
dat <- expand.grid(x, x)
dat$Var3 <- rnorm(9)
We can do this in base R. I'm not very comfortable with the reshape function, but it can do the job. The column names need cleaning up afterwards.
> reshape(dat, idvar = "Var1", timevar = "Var2", direction = "wide")
Var1 Var3.A Var3.B Var3.C
1 A -1.2442937 -0.01132871 -0.5693153
2 B -1.6044295 -1.34907504 1.6778866
3 C 0.5393472 -1.00637345 -0.7694940
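One possible way to do that cleanup and end up with an actual matrix rather than a data frame is sketched below. It assumes the Var3. prefix shown in the output above; the names wide and m, and the set.seed call, are only there to make the illustration reproducible:

```r
# Same fake data as above, with a seed so the sketch is reproducible
set.seed(1)
x <- c("A", "B", "C")
dat <- expand.grid(x, x)
dat$Var3 <- rnorm(9)

wide <- reshape(dat, idvar = "Var1", timevar = "Var2", direction = "wide")

# Drop the id column, then move its values into the row names
m <- as.matrix(wide[, -1])
rownames(m) <- as.character(wide$Var1)

# Strip the "Var3." prefix that reshape() adds to the column names
colnames(m) <- sub("^Var3\\.", "", colnames(m))
m
```

Here the rows correspond to Var1 and the columns to Var2; use t(m) if you want them the other way round.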
Alternatively, you can use the dcast function from the reshape2 package. I think the output is cleaner.
> library(reshape2)
> dcast(dat, Var1 ~ Var2, value.var = "Var3")
Var1 A B C
1 A -1.2442937 -0.01132871 -0.5693153
2 B -1.6044295 -1.34907504 1.6778866
3 C 0.5393472 -1.00637345 -0.7694940
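If you would rather avoid the extra package, base R's xtabs can produce a similar cross-tabulated result with row and column names already attached. This is a sketch of an alternative, not what either answer above uses. Note that xtabs sums the values for duplicated pairs and fills missing pairs with 0 rather than NA, so it is only a drop-in replacement when every pair occurs exactly once:

```r
# Same fake data as above, with a seed so the sketch is reproducible
set.seed(1)
x <- c("A", "B", "C")
dat <- expand.grid(x, x)
dat$Var3 <- rnorm(9)

# Rows are the levels of Var1, columns the levels of Var2
tab <- xtabs(Var3 ~ Var1 + Var2, data = dat)
tab

# The result can be indexed by name like a matrix
tab["B", "C"]
```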