Profiling Monty Hall code with the aprof package

Time: 2016-01-31 20:26:54

Tags: r performance memory-management

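I have written R code for the Monty Hall problem. As far as I can tell, the code works. However, I am also using the aprof package to try to reduce the run time and memory requirements of the code. I was able to cut the run time by 50%, but I cannot get the memory profiling features of aprof to work. Thank you for any advice or help with resolving this error.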

First I describe the Monty Hall problem:

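A contestant is shown three doors. One door hides a nice prize; the other two hide bad prizes. The contestant does not know what is behind each door. The contestant picks a door. The host then opens one of the two doors the contestant did not pick, and never opens the door hiding the nice prize. Next, the host asks whether the contestant wants to keep the originally chosen door or switch to the one remaining closed door. What should the contestant do? Answer: the contestant should always switch. The initial pick wins with probability 1/3 (about 33%) and loses with probability 2/3 (about 67%). Because the host always reveals a bad prize, switching wins exactly when the initial pick was wrong, which raises the probability of winning to about 67%.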
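The 1/3 versus 2/3 split can be checked with a short vectorized simulation. This is only a sketch, separate from the function foo in this question; the names `prize`, `pick`, `keep.wins`, and `switch.wins` are made up for illustration. It relies on the observation that, since the host never reveals the nice prize, switching wins exactly when the initial pick was wrong.

```r
set.seed(42)
n <- 100000                                # number of simulated games

prize <- sample(3, n, replace = TRUE)      # door hiding the nice prize
pick  <- sample(3, n, replace = TRUE)      # contestant's initial pick

# Keeping wins only if the initial pick was already the prize door.
keep.wins <- mean(pick == prize)

# The host always opens a bad door, so the remaining closed door
# holds the prize whenever the initial pick was wrong.
switch.wins <- mean(pick != prize)

print(c(keep = keep.wins, switch = switch.wins))
```

The two proportions always sum to 1, and with this many games they should land close to 1/3 and 2/3.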

Here is the R code, which I believe works.

library(aprof)
set.seed(1234)
foo <- function(N) {

     game.interations  <- 10000
     contestant.action <- rep(NA, game.interations)
     game.result       <- rep('lose', game.interations)

     for(i in 1:game.interations) {

          door <- c(0,0,0)
          door[sample(3, 1)] = 1            # assign nice prize to a door
                                            # door  with '1' has  nice prize
                                            # doors with '0' have bad  prize
          initial.pick <- sample(3, 1)      # initial contestant action
          not.picked   <- c(1:3)[-initial.pick]
          door.opened.by.host <- not.picked[1]
          if(door[initial.pick   ]==1) door.opened.by.host = not.picked[sample(2,1)]
          if(door[  not.picked[1]]==1) door.opened.by.host = not.picked[2]
          contestant.action[i] <- sample(c('k', 's'), 1)
          second.pick <- ifelse(contestant.action[i] == 'k', initial.pick, 
                         not.picked[which(not.picked!=door.opened.by.host)])
          if(door[second.pick]==1) game.result[i] = 'win'
     }

x <- table(contestant.action , game.result)         # examine probability of 
                                                    # winning by action 
prop.table(x)

}

foo(1e4)   # the argument is never used inside foo; the loop always runs 10000 games

#                      game.result
# contestant.action   lose    win
#          k (keep)   0.3293 0.1613
#          s (switch) 0.1705 0.3389
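The table above holds joint proportions over all games, so each row sums to roughly 0.5. To read it as a win rate per action, one option is `prop.table` with `margin = 1`. A minimal sketch, rebuilding the table by hand from the four proportions printed above (the object names `x` and `cond` are illustrative):

```r
# Joint proportions copied from the output above.
x <- matrix(c(0.3293, 0.1613,
              0.1705, 0.3389),
            nrow = 2, byrow = TRUE,
            dimnames = list(contestant.action = c("k", "s"),
                            game.result       = c("lose", "win")))

# Normalize each row so it sums to 1: P(result | action).
cond <- prop.table(x, margin = 1)
print(round(cond, 3))
```

With these numbers the conditional win rate comes out at about 0.33 for keeping and about 0.67 for switching, in line with the expected 1/3 and 2/3.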

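This is where the aprof code starts. From this point on, the code is taken from the package documentation. The code in this section also seems to work, and it reports the time taken by each line of the function foo.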

## save function to a source file and reload
dump("foo",file="foo.R")
source("foo.R")

## create file to save profiler output
tmp<-tempfile()

## Profile the function
Rprof(tmp,line.profiling=TRUE)
foo(1e4)
Rprof(append=FALSE)

## Create a aprof object
fooaprof<-aprof("foo.R",tmp)

## display basic information, summarize and plot the object
fooaprof
summary(fooaprof)
plot(fooaprof)

# another plot
profileplot(fooaprof)

This is where the memory profiling code starts. The line that returns the error is marked below.

## To continue with memory profiling:
## enable memory.profiling=TRUE
Rprof(tmp,line.profiling=TRUE,memory.profiling=TRUE)
foo(1e4)
Rprof(append=FALSE)

#
# This line returns the error message below
#
## Create a aprof object
fooaprof<-aprof("foo.R",tmp)
#
# Error in `colnames<-`(`*tmp*`, value = c("sm_v_heap", "lrg_v_heap", "mem_in_node" : 
# 'names' attribute [3] must be the same length as the vector [0]
#

## display basic information, and plot memory usage
fooaprof
plot(fooaprof)

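Below are the contents of the file that I think the aprof package is trying to read when it returns the error, although I am not sure. Note that this is a hidden file: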

memory profiling: line profiling: sample.interval=20000
#File 1: foo.R
:153316:554084:15881544:162:1#22 "foo" 
:150595:494927:15084104:869:1#26 "foo" 
:149818:473956:14839440:908:1#12 "sample" 1#12 "foo" 
:147827:430136:14250768:879:"sample" 1#16 "foo" 
:154551:576315:16254896:864:1#24 "foo" 
:151678:512463:15404032:896:"is.numeric" "sample" 1#12 "foo" 
:150598:488049:15083040:929:"length" "sample" 1#12 "foo" 
:146904:403852:13989752:857:"sample.int" "sample" 1#12 "foo" 
:146035:384446:13729968:919:"sample" 1#24 "foo" 
:156862:629525:16944760:955:"sample.int" "sample" 1#24 "foo" 
:154543:577567:16250584:905:1#24 "foo" 
:150690:595793:15020264:942:
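Each sample line in this file starts with four colon-separated numbers. As far as I understand the R documentation for Rprof and summaryRprof, these are the small-vector heap, the large-vector heap, the memory in nodes, and the number of calls to duplicate in the interval. The sketch below pulls those fields out of three of the lines above with a base R regular expression. It is not aprof's own parsing code; the first three column names are taken from the error message, and `duplications` is just a label chosen here.

```r
# Three sample lines copied from the profiler output above.
prof.lines <- c(':153316:554084:15881544:162:1#22 "foo" ',
                ':147827:430136:14250768:879:"sample" 1#16 "foo" ',
                ':150690:595793:15020264:942:')

# Four numeric fields, each terminated by a colon, at the start of the line.
pattern <- "^:([0-9]+):([0-9]+):([0-9]+):([0-9]+):"
matches <- regmatches(prof.lines, regexec(pattern, prof.lines))

# Element 1 of each match is the full match; elements 2 to 5 are the fields.
mem <- t(vapply(matches, function(z) as.numeric(z[2:5]), numeric(4)))
colnames(mem) <- c("sm_v_heap", "lrg_v_heap", "mem_in_node", "duplications")
print(mem)
```

Note that the last line has memory fields but no call stack after them, so a pattern that expects something to follow the final colon would fail to match it.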

1 answer:

Answer 0 (score: 2)

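This was caused by a bug (incomplete regular expression syntax). Thank you for reporting it. The bug is fixed in version 0.3.2, which is available from wildcard. It will be uploaded to CRAN soon.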
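To check whether an installed copy of aprof already includes the fix, the installed version can be compared against 0.3.2. A minimal sketch using only base R; the variable name `needs.update` is made up for illustration:

```r
# TRUE if aprof is missing or older than the fixed release 0.3.2.
needs.update <- !requireNamespace("aprof", quietly = TRUE) ||
  utils::packageVersion("aprof") < "0.3.2"

if (needs.update) {
  message("aprof is not installed or is older than 0.3.2")
} else {
  message("aprof ", as.character(utils::packageVersion("aprof")),
          " should include the fix")
}
```

If `needs.update` is TRUE, installing version 0.3.2 or later and rerunning the memory profiling section is the next thing to try.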