I have this data.table:
library(data.table)
df <- data.table(u = c(1,2,3,4,5), d = c(1,2,0,1,2), V1 = c(0.3, 0.2, 0.2, 0.1, 0.2),
                 pred = c(1,2,0,1,0), sec_pred = c(2,1,0,1,0), ones = rep(1,5))
#   u d  V1 pred sec_pred ones
#1: 1 1 0.3    1        2    1
#2: 2 2 0.2    2        1    1
#3: 3 0 0.2    0        0    1
#4: 4 1 0.1    1        1    1
#5: 5 2 0.2    0        0    1
I would like to get a matrix like this:
dcast(df, d + V1 + u ~ pred + sec_pred, fill = 0, value.var = 'ones')
#   d  V1 u 0_0 1_1 1_2 2_1
#1: 0 0.2 3   1   0   0   0
#2: 1 0.1 4   0   1   0   0
#3: 1 0.3 1   0   0   1   0
#4: 2 0.2 2   0   0   0   1
#5: 2 0.2 5   1   0   0   0
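However, since my data.table is very large, I would like to create a sparse matrix instead. It would also be great to get a column for every possible combination of the pred and sec_pred values, e.g. 0_0, 0_1, 0_2, 1_0, 1_1, ...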
Answer 0 (score: 1)
One option would be:
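library(Matrix)
# Build one "pred_secpred" key per row from columns 4 and 5
v1 <- df[, do.call(paste, c(.SD, list(sep = "_"))), .SDcols = 4:5]
# Map each key to a column index, in order of first appearance
j1 <- match(v1, unique(v1))
# One nonzero entry (value 1) per row
sM <- sparseMatrix(i = seq_len(nrow(df)), j = j1, x = 1,
                   dimnames = list(NULL, unique(v1)))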
sM
# 5 x 4 sparse Matrix of class "dgCMatrix"
#     1_2 2_1 0_0 1_1
#[1,]   1   .   .   .
#[2,]   .   1   .   .
#[3,]   .   .   1   .
#[4,]   .   .   .   1
#[5,]   .   .   1   .
If we need the columns in sorted order:
sM[,order(colnames(sM))]
# 5 x 4 sparse Matrix of class "dgCMatrix"
#     0_0 1_1 1_2 2_1
#[1,]   .   .   1   .
#[2,]   .   .   .   1
#[3,]   1   .   .   .
#[4,]   .   1   .   .
#[5,]   1   .   .   .
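The question also asks for a column for every possible combination (0_0, 0_1, 0_2, 1_0, ...), which the code above does not produce because it only creates columns for the pairs that occur in the data. A minimal sketch of one way to get them, assuming that both columns draw from the same set of values and that this set can be read off the data; the names lv, cmb, all_names and sM_all are introduced here for illustration only:

```r
library(data.table)
library(Matrix)

df <- data.table(u = c(1,2,3,4,5), d = c(1,2,0,1,2), V1 = c(0.3, 0.2, 0.2, 0.1, 0.2),
                 pred = c(1,2,0,1,0), sec_pred = c(2,1,0,1,0), ones = rep(1,5))

# Assumption: the set of possible values is whatever occurs in either column
lv <- sort(unique(c(df$pred, df$sec_pred)))

# Cross join of the values gives every (pred, sec_pred) pair, sorted
cmb <- CJ(pred = lv, sec_pred = lv)
all_names <- paste(cmb$pred, cmb$sec_pred, sep = "_")

# Key of each row, matched against the full list of combinations
v1 <- paste(df$pred, df$sec_pred, sep = "_")
j1 <- match(v1, all_names)

# Fix the dimensions explicitly so unobserved combinations stay as empty columns
sM_all <- sparseMatrix(i = seq_len(nrow(df)), j = j1, x = 1,
                       dims = c(nrow(df), length(all_names)),
                       dimnames = list(NULL, all_names))
sM_all
```

Because the column index comes from match() against the full list rather than against unique(v1), the columns are already in sorted order and no reordering step is needed. If the possible values are known in advance, lv can be set by hand instead of being derived from the data.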