Question

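I want to first compute choice-switching probabilities by group (user in the code below). I will then average the group-level probabilities to get an overall probability. I have thousands of groups, so I need fast code. My code uses for loops and takes more than 10 minutes to run. The same logic in Excel takes only a few seconds.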

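For a particular user, the switching probability from choice m to choice n is defined as the share of observations in which the user chose n in period t and m in period t-1. My original code first labels the first and last purchase with a for loop, then uses another for loop to build the switching matrix. I was only able to create the switching matrix over the whole data set, not by group. Even so, it is still slow, and adding users makes it slower.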
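Read together with the desired output further down, where every row of the probability matrix sums to 1, this definition appears to normalize within each previous choice m. A sketch of that reading follows; the symbols N, K, u and c are notation introduced here for illustration, not taken from the post:

```latex
% N^{(u)}_{mn}: number of periods t in which user u chose m at t-1 and n at t
% K: total number of choices (3 in the example data)
N^{(u)}_{mn} = \#\{\, t : c_{u,t-1} = m,\ c_{u,t} = n \,\},
\qquad
p^{(u)}_{mn} = \frac{N^{(u)}_{mn}}{\sum_{k=1}^{K} N^{(u)}_{mk}}
```

Under this reading, a row m that user u never chose has a zero denominator, so the corresponding probabilities are undefined for that user.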

t<-c(1,2,1,1,2,3,4,5)
user<-c('A','A','B' ,'C','C','C','C','C')
choice<-c(1,1,2,1,2,1,3,3)
dt<-data.frame(t,user,choice)

t user choice
1    A      1
2    A      1
1    B      2
1    C      1
2    C      2
3    C      1
4    C      3
5    C      3

# **step one** create a second choice column for later construction of the switching matrix
# Label first purchase and last purchase is zero
for (i in 1:nrow(dt))
{
  ifelse (dt$user[i+1]==dt$user[i],dt$newcol[i+1]<-0,dt$newcol[i+1]<-1)
}

# **step two** create switching matrix
# switching.m is an empty matrix with the size of total choice: 3x3 here
length(unique(dt$user))
total.choice<-3
switching.m<-matrix(0,nrow=total.choice,ncol=total.choice)

for (i in 1:total.choice)
{
  for(j in 1:total.choice)
  {
    if(length(nrow(switching.m[switching.m[,1]==i& switching.m[,2]==j,])!=0))
    {switching.m[i,j]=nrow(dt[dt[,1]==i&dt[,2]==j,])}
    else {switching.m[i,j]<0}
  }
}

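The desired output for a particular user/group looks like this. The output should have the same matrix size even if the user never made a particular choice.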
# take user C
# output for switching matrix
      second choice
first 1 2 3
    1 0 1 1
    2 1 0 0
    3 0 0 1

# output for switching probability
      second choice
first 1   2   3
    1 0 0.5 0.5
    2 1   0   0
    3 0   0   1

Answer 1

We can split by 'user', then apply table followed by prop.table

lst <- lapply(split(dt, dt$user), function(x)
     table(factor(x$choice, levels= 1:3), factor(c(x$choice[-1], NA), levels=1:3)))

As @nicola mentioned, it is more compact to split the 'choice' column by 'user'

lst <- lapply(split(dt$choice, dt$user), function(x) 
       table(factor(x, levels = 1:3), factor(c(x[-1], NA), levels = 1:3))) 

lst$C

#  1 2 3
#1 0 1 1
#2 1 0 0
#3 0 0 1
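With thousands of users, calling table once per group may itself become a bottleneck. One possible alternative, offered as a sketch rather than a benchmarked replacement, is to build the next-choice column with ave and tabulate everything in a single three-way table. The names nxt and counts are hypothetical, and the code assumes the rows are already sorted by t within each user, as in the sample data.

```r
# Sample data from the question
dt <- data.frame(
  t      = c(1, 2, 1, 1, 2, 3, 4, 5),
  user   = c("A", "A", "B", "C", "C", "C", "C", "C"),
  choice = c(1, 1, 2, 1, 2, 1, 3, 3)
)

# Next choice within each user; NA marks each user's last observation
# (assumes rows are ordered by t within user)
nxt <- ave(dt$choice, dt$user, FUN = function(x) c(x[-1], NA))

# One 3 x 3 x n_users table; rows with an NA next choice are dropped,
# and fixing the factor levels keeps every slice 3 x 3
counts <- table(
  first  = factor(dt$choice, levels = 1:3),
  second = factor(nxt, levels = 1:3),
  user   = dt$user
)

# Switching counts for user C
counts[, , "C"]
```

Each slice counts[, , u] holds the same counts as lst[[u]] above. Users with a single observation, such as B, still get an all-zero slice.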


prb <- lapply(lst, prop.table, 1)
prb$C

#     1   2   3
#  1 0.0 0.5 0.5
#  2 1.0 0.0 0.0
#  3 0.0 0.0 1.0
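The question also asks for the group-level probabilities to be averaged into an overall matrix, which the code above stops short of. A minimal sketch of one way to do that follows. It assumes that rows a user never started from (which prop.table returns as NaN) should be left out of the average instead of being counted as zeros; that choice is an assumption, not something the post specifies. The names arr and avg are hypothetical.

```r
# Sample data and per-user probabilities, as in the answer
dt <- data.frame(
  t      = c(1, 2, 1, 1, 2, 3, 4, 5),
  user   = c("A", "A", "B", "C", "C", "C", "C", "C"),
  choice = c(1, 1, 2, 1, 2, 1, 3, 3)
)
lst <- lapply(split(dt$choice, dt$user), function(x)
  table(factor(x, levels = 1:3), factor(c(x[-1], NA), levels = 1:3)))
prb <- lapply(lst, prop.table, 1)

# Stack the per-user matrices into a 3 x 3 x n_users array
arr <- simplify2array(prb)

# Cell-wise mean across users, skipping NaN entries
# (assumption: users with no transitions out of a choice do not
#  contribute to that row of the average)
avg <- apply(arr, c(1, 2), mean, na.rm = TRUE)
avg
```

If every user lacks transitions out of some choice, the corresponding row of avg stays NaN, which may be worth checking before using the result.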
