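How can I make the following code faster? So far, the whole process takes about 15 minutes for P = 1 (i.e., a single iteration of the outer loop). I know the problem must be the for loops, and I have read several related questions, but I could not understand how their solutions work.
In the following script, P and R are about 1000, and TOLTarget and TOLSource can be as large as 500.
Any help would be greatly appreciated.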
for(i in 1:P)
{
  Source <- MITLinks[i,1]
  Target <- MITLinks[i,2]
  TOLTarget <- sum(!is.na(MITMatrix[Target,]))-1 # TOLTarget is the number of concepts for the target course
  TOLSource <- sum(!is.na(MITMatrix[Source,]))-1
  for(q in 2:TOLSource) # since the first column is the courseID
  {
    DD <- vector(length = R)
    ConceptIDSource <- MITMatrix[Source,q]
    counterq <- 1 # counterq points to the cell of vector DD that keeps the courses from the other university
    for(c in 1:R)
    {
      if(CALBinary[c,match(ConceptIDSource,BB)]==1) # if(CALBinary[c,"ConceptIDSource"]==1)
      {
        DD[counterq] <- c # it is the courseID
        counterq <- counterq+1
      }
    }
    DD <- DD[ DD != 0 ] # DD keeps all courses from the other university that share the same concept as the source course in the first university (MIT)
    for(j in 2:TOLTarget) # since the first column is the courseID
    {
      ZZ <- vector(length = R)
      ConceptIDTarget <- MITMatrix[Target,j]
      counter <- 1
      for(v in 1:R)
      {
        if(CALBinary[v,match(ConceptIDTarget,BB)]==1) # if(CALBinary[v,"ConceptIDTarget"]==1)
        {
          ZZ[counter] <- v # v is the courseID
          counter <- counter+1
        }
      }
      ZZ <- ZZ[ ZZ != 0 ] # delete the zero elements from the vector
      Jadval <- expand.grid(Source,Target,ConceptIDSource,ConceptIDTarget,DD,ZZ)
      Total <- rbind(Total,Jadval) # to make all possible pairs of courses for the source and the target course
      Total
    }
  }
}
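Answer 0 (score: 1)
There are many ways to improve this code and make it faster. It looks like you are essentially writing C-style code rather than taking advantage of R's built-in vectorized functions. Here is one example. This part of the code: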
DD <- vector(length = R)
ConceptIDSource <- MITMatrix[Source,q]
counterq <- 1 # counterq points to the cell of vector DD that keeps the courses from the other university
for(c in 1:R)
{
  if(CALBinary[c,match(ConceptIDSource,BB)]==1) # if(CALBinary[c,"ConceptIDSource"]==1)
  {
    DD[counterq] <- c # it is the courseID
    counterq <- counterq+1
  }
}
DD <- DD[ DD != 0 ]
can be written like this:
ConceptIDSource <- MITMatrix[Source,q]
CalBinaryBB <- CALBinary[,match(ConceptIDSource,BB)] # look up the column once, outside any loop
DD <- which(CalBinaryBB[1:R]==1) # indices of the rows whose value is 1
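In your code, match() is called on every iteration of the loop, which is unnecessary. Also, since you are looking for the indices where CALBinary[c,match(ConceptIDSource,BB)]==1, the built-in R function which() does this much faster.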
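As a quick sanity check, the two approaches can be compared on toy data. This is only a sketch: the matrix contents, the concept IDs in BB, and the names DD_loop, DD_vec, col and row_i are made up for illustration and are not taken from the original data.

```r
# Toy data: every value here is invented for illustration.
set.seed(1)
R  <- 6
BB <- c("c1", "c2", "c3")                 # concept IDs, one per CALBinary column
CALBinary <- matrix(rbinom(R * length(BB), 1, 0.5), nrow = R)
ConceptIDSource <- "c2"

# Loop version: scans every row and calls match() on each iteration.
DD_loop <- vector(length = R)
counterq <- 1
for (row_i in 1:R) {
  if (CALBinary[row_i, match(ConceptIDSource, BB)] == 1) {
    DD_loop[counterq] <- row_i
    counterq <- counterq + 1
  }
}
DD_loop <- DD_loop[DD_loop != 0]

# Vectorized version: look up the column once, then let which() find the rows.
col    <- match(ConceptIDSource, BB)
DD_vec <- which(CALBinary[, col] == 1)

# Both versions select the same row indices.
stopifnot(identical(as.integer(DD_loop), DD_vec))
```

The vectorized version replaces R separate comparisons and R calls to match() with a single comparison over the whole column.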
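It looks like you can do the same thing in the second part of the loop, and there are probably other optimization opportunities as well.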
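A rough sketch of what that could look like for a single Source/Target pair is below. It rests on assumptions: the toy CALBinary, BB, source_concepts and target_concepts stand in for the real data, and rows_for, target_rows and pieces are hypothetical helper names. It also collects the pieces in a list and binds them once at the end, rather than calling rbind() inside the loop; that is a common R idiom, not something tested against the original data.

```r
# Toy inputs: all values are hypothetical.
BB <- c("c1", "c2", "c3", "c4")
CALBinary <- matrix(c(1, 0, 1, 0,
                      0, 1, 1, 0,
                      1, 1, 0, 0), nrow = 3, byrow = TRUE)
Source <- 101
Target <- 202
source_concepts <- c("c1", "c3")   # stands in for the concepts in MITMatrix[Source, ]
target_concepts <- c("c2")         # stands in for the concepts in MITMatrix[Target, ]

# Rows of CALBinary that contain a given concept (hypothetical helper).
rows_for <- function(concept) which(CALBinary[, match(concept, BB)] == 1)

# The target rows do not depend on the source concept, so compute them once.
target_rows <- lapply(target_concepts, rows_for)

pieces <- list()
for (cs in source_concepts) {
  DD <- rows_for(cs)
  for (k in seq_along(target_concepts)) {
    pieces[[length(pieces) + 1]] <- expand.grid(
      Source = Source, Target = Target,
      ConceptIDSource = cs, ConceptIDTarget = target_concepts[k],
      DD = DD, ZZ = target_rows[[k]],
      stringsAsFactors = FALSE)
  }
}

# Bind all pieces in one call instead of growing Total inside the loop.
Total <- do.call(rbind, pieces)
```

Growing a data frame with rbind() on every iteration copies everything accumulated so far, so with hundreds of concepts per course that copying can come to dominate the run time.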