Here is a sample of the input I am working with:
User ID 1 --- Artist 5
User ID 2 --- Artist 1
User ID 3 --- Artist 7
User ID 4 --- Artist 2
User ID 5 --- Artist 3
User ID 1 --- Artist 2
User ID 3 --- Artist 1
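The data above records which artists the users of an app have listened to.
I would like to generate an adjacency matrix like the example below: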
          ARTIST 1 ARTIST 2 ARTIST 3 ARTIST 4 ARTIST 5 ARTIST 6 ARTIST 7
USER ID 1        0        1        0        0        1        0        0
USER ID 2        1        0        0        0        0        0        0
USER ID 3        1        0        0        0        0        0        1
USER ID 4        0        1        0        0        0        0        0
USER ID 5        0        0        1        0        0        0        0
How can I do this in R? Any hints or pointers would be very welcome.
Thanks in advance for your time and help.
Answer 0 (score: 4)
If DF is a two-column data frame corresponding to the data in the question, then:
xtabs(data = DF)
gives:
           V2
V1          Artist 1 Artist 2 Artist 3 Artist 5 Artist 7
  User ID 1        0        1        0        1        0
  User ID 2        1        0        0        0        0
  User ID 3        1        0        0        0        1
  User ID 4        0        1        0        0        0
  User ID 5        0        0        1        0        0
Note: we used this as the input:
DF <- structure(list(V1 = structure(c(1L, 2L, 3L, 4L, 5L, 1L, 3L), .Label = c("User ID 1",
"User ID 2", "User ID 3", "User ID 4", "User ID 5"), class = "factor"),
V2 = structure(c(4L, 1L, 5L, 2L, 3L, 2L, 1L), .Label = c("Artist 1",
"Artist 2", "Artist 3", "Artist 5", "Artist 7"), class = "factor")), .Names = c("V1",
"V2"), class = "data.frame", row.names = c(NA, -7L))
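The xtabs result only has columns for artists that actually occur in the data, whereas the matrix requested in the question also lists Artist 4 and Artist 6. A minimal sketch of one way to get those empty columns, assuming the full set of artists is known in advance (the name all_artists and the range 1:7 are assumptions taken from the question's example, not part of the original answer):

```r
# Same seven user/artist pairs as in the question, as plain character columns.
DF <- data.frame(
  V1 = c("User ID 1", "User ID 2", "User ID 3", "User ID 4",
         "User ID 5", "User ID 1", "User ID 3"),
  V2 = c("Artist 5", "Artist 1", "Artist 7", "Artist 2",
         "Artist 3", "Artist 2", "Artist 1"),
  stringsAsFactors = FALSE
)

# Assumption: the complete artist list is Artist 1 .. Artist 7.
all_artists <- paste("Artist", 1:7)

# Declaring every artist as a factor level makes xtabs() keep
# all-zero columns for artists nobody listened to.
DF$V2 <- factor(DF$V2, levels = all_artists)

adj <- xtabs(~ V1 + V2, data = DF)
adj
```

This relies on xtabs keeping unused factor levels, which is its default behaviour (drop.unused.levels = FALSE).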
Answer 1 (score: 3)
This works:
# get data in useable form
ContingencyTable <- read.table(text=gsub(pattern = " --- ", replacement = ",","User ID 1 --- Artist 5
User ID 2 --- Artist 1
User ID 3 --- Artist 7
User ID 4 --- Artist 2
User ID 5 --- Artist 3
User ID 1 --- Artist 2
User ID 3 --- Artist 1"),sep=",", stringsAsFactors = FALSE)
# add variable for match value
ContingencyTable$Val <- 1
# more or less lifted from Arun's answer linked by @Hong Ooi, above
adjMat <- reshape2::dcast(ContingencyTable, V1 ~ V2, value.var = "Val", fill=0)
rownames(adjMat) <- adjMat[,1]
adjMat <- adjMat[,2:ncol(adjMat)]
adjMat
          Artist 1 Artist 2 Artist 3 Artist 5 Artist 7
User ID 1        0        1        0        1        0
User ID 2        1        0        0        0        0
User ID 3        1        0        0        0        1
User ID 4        0        1        0        0        0
User ID 5        0        0        1        0        0
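If the same user/artist pair can appear more than once in the log, a count-based table will contain values greater than 1. A rough base R sketch that tabulates with table() and then collapses the counts to 0/1 might look like the following; the repeated row appended here is hypothetical and only added to show the effect, and the names dat, counts and adj are illustrative:

```r
# The seven pairs from the question.
dat <- data.frame(
  V1 = c("User ID 1", "User ID 2", "User ID 3", "User ID 4",
         "User ID 5", "User ID 1", "User ID 3"),
  V2 = c("Artist 5", "Artist 1", "Artist 7", "Artist 2",
         "Artist 3", "Artist 2", "Artist 1"),
  stringsAsFactors = FALSE
)

# Hypothetical: User ID 1 listens to Artist 5 a second time.
dat <- rbind(dat, dat[1, ])

# table() counts every occurrence of each user/artist pair.
counts <- table(dat$V1, dat$V2)

# Collapse counts to a 0/1 integer matrix: 1 if the pair occurs at all.
adj <- (unclass(counts) > 0) * 1L
adj
```

Here counts holds the number of listens per pair, while adj only records whether the pair occurs, which is what an adjacency matrix needs.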
Answer 2 (score: 2)
The qdap package has an adjmat function that can do this:
dat <- read.table(text=gsub(pattern = " --- ", replacement = ",",
"User ID 1 --- Artist 5
User ID 2 --- Artist 1
User ID 3 --- Artist 7
User ID 4 --- Artist 2
User ID 5 --- Artist 3
User ID 1 --- Artist 2
User ID 3 --- Artist 1"),sep=",", stringsAsFactors = FALSE)
library(qdap)
x <- with(dat, termco(V1, V2, unique(V1)))
adjmat(x)$boolean
## > adjmat(x)$boolean
##           Artist 1 Artist 2 Artist 3 Artist 5 Artist 7
## User ID 1        0        1        0        1        0
## User ID 2        1        0        0        0        0
## User ID 3        1        0        0        0        1
## User ID 4        0        1        0        0        0
## User ID 5        0        0        1        0        0
PS: Nice way to read in the data, Tim Riffe :)