Question

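The goal is to populate an upper triangular matrix with counts computed from a ratings dataset. Each value is computed and stored by looking up the correct index, so the entries are not filled in sequential order. The R code below works, but it takes too long on large datasets.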

ratings <- read.csv("ratings.csv", header=TRUE, sep=",")    
> head(ratings)
  userId movieId rating  timestamp
1      1      16    4.0 1217897793
2      1      24    1.5 1217895807
3      1      32    4.0 1217896246
4      1      47    4.0 1217896556
5      1      50    4.0 1217896523
6      1     110    4.0 1217896150

# `movies` is assumed to be a data frame with a movieId column, loaded elsewhere
no_nodes <- nrow(movies)*2
temp <- movies$movieId
nodes_name <- c(paste(temp,"-L",sep=""),paste(temp,"-D",sep=""))

ac_graph <- matrix(NA,nrow=no_nodes,ncol=no_nodes,dimnames = list(nodes_name,nodes_name))
# seq_len(n-1) avoids the backwards range (n+1):n on the last iteration
for(i in seq_len(nrow(movies)-1)){
  for(j in (i+1):nrow(movies)){
    # index by movie ID (temp[i], temp[j]), not by loop position
    # "-L" block: users who rated both movies above 2.5
    ac_graph[paste(temp[i],"-L",sep=""),paste(temp[j],"-L",sep="")] <- length(intersect(ratings$userId[ratings$movieId==temp[i]&ratings$rating>2.5],ratings$userId[ratings$movieId==temp[j]&ratings$rating>2.5]))
    # "-D" block: users who rated both movies 2.5 or below
    ac_graph[paste(temp[i],"-D",sep=""),paste(temp[j],"-D",sep="")] <- length(intersect(ratings$userId[ratings$movieId==temp[i]&ratings$rating<=2.5],ratings$userId[ratings$movieId==temp[j]&ratings$rating<=2.5]))
  }
}

Is it possible to do the same thing with apply, sapply, outer, or some other function?
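
One direction that might avoid the double loop entirely, sketched here under assumptions rather than as a tested solution: build a users-by-movies 0/1 matrix for "liked" (rating > 2.5) and another for "disliked" (rating <= 2.5), then take crossprod() of each. Entry [i, j] of the cross-product is the number of users who have a 1 for both movies, which is the same count as length(intersect(...)). The toy data and the helper name indicator below are hypothetical, made up only for illustration.

```r
# Toy stand-ins for ratings.csv and the movies table (hypothetical values)
ratings <- data.frame(
  userId  = c(1, 1, 1, 2, 2, 3, 3, 3),
  movieId = c(16, 24, 32, 16, 32, 16, 24, 32),
  rating  = c(4.0, 1.5, 4.0, 3.0, 5.0, 2.0, 1.0, 4.5)
)
movies <- data.frame(movieId = c(16, 24, 32))

ids   <- movies$movieId
users <- sort(unique(ratings$userId))
n     <- length(ids)

# Hypothetical helper: users x movies matrix, 1 where the user has a row in `sub`
indicator <- function(sub) {
  sub <- sub[sub$movieId %in% ids, ]   # drop ratings for movies not in `movies`
  m <- matrix(0, nrow = length(users), ncol = n)
  m[cbind(match(sub$userId, users), match(sub$movieId, ids))] <- 1
  m
}

liked    <- indicator(ratings[ratings$rating >  2.5, ])
disliked <- indicator(ratings[ratings$rating <= 2.5, ])

# movies x movies co-occurrence counts: [i, j] = users flagged for both i and j
LL <- crossprod(liked)
DD <- crossprod(disliked)

# Keep only the strict upper triangle, as in the loop version
LL[!upper.tri(LL)] <- NA
DD[!upper.tri(DD)] <- NA

nodes_name <- c(paste0(ids, "-L"), paste0(ids, "-D"))
ac_graph <- matrix(NA, nrow = 2 * n, ncol = 2 * n,
                   dimnames = list(nodes_name, nodes_name))
ac_graph[seq_len(n), seq_len(n)]         <- LL   # "-L" block
ac_graph[n + seq_len(n), n + seq_len(n)] <- DD   # "-D" block
```

The dense indicator matrices need users x movies cells each, so for a large dataset a sparse representation (for example via the Matrix package) would probably be needed; that part is not shown here.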

Populating an upper triangular matrix in R without loops

0 answers: