I am trying to combine the parallelization feature of plyr
with a call to a Python function via reticulate,
but the same seed seems to be used in every parallel instance.
In Python:
# This is called python_script.py
import random

def give_a_rand():
    return random.random()
In R:
library(reticulate)
library(plyr)
library(doMC)
doMC::registerDoMC(cores=10)
reticulate::source_python('/path/to/python_script.py')
After loading the libraries, registering the cores for plyr,
and linking the Python script to my R session via reticulate,
I can call the Python function give_a_rand()
natively from R:
> give_a_rand()
[1] 0.896585
We can use plyr to run it many times without parallelizing it:
> aaply(.data=1:10, .margins=1, .fun=function(x){give_a_rand()}, .parallel=F)
1 2 3 4 5 6
0.183420430 0.539790166 0.817348174 0.130959177 0.143210990 0.794048321
7 8 9 10
0.276724929 0.820918953 0.003462523 0.903942433
All is great so far, but how do I parallelize it? With .parallel=T every call returns the same value. I guess that at some point I need to force the seed of the random number generator so that every instance gets a different one.
> aaply(.data=1:10, .margins=1, .fun=function(x){give_a_rand()}, .parallel=T)
1 2 3 4 5 6 7 8
0.896585 0.896585 0.896585 0.896585 0.896585 0.896585 0.896585 0.896585
9 10
0.896585 0.896585
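The repeated value is consistent with every worker starting from the same generator state. The effect can be reproduced in pure Python with a minimal sketch, assuming (this is my guess, not something I verified) that doMC forks the R process and each fork inherits a copy of the embedded Python interpreter, including the state of the random module. The helper simulate_worker is hypothetical and only mimics a fork by restoring a saved state:

```python
import random

# Assumption: a forked worker starts with an exact copy of the parent's
# interpreter, including the internal state of the `random` module.
parent_state = random.getstate()

def simulate_worker(state):
    """Mimic a freshly forked worker: restore the inherited state, draw once."""
    random.setstate(state)
    return random.random()

# Ten "workers" that all inherit the same state draw the same number.
draws = [simulate_worker(parent_state) for _ in range(10)]
print(len(set(draws)))  # 1
```

If that guess is right, the fix has to change the state inside each worker, not in the parent session.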
Answer 0 (score: 0)
OK, based on this answer, I modified the Python function, and it works now:
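import random

def seed_from_urandom():
    # Build an integer seed from 4 bytes of OS entropy.
    rand_int = 0
    with open("/dev/urandom", "rb") as f:
        rnd_str = f.read(4)
    for c in rnd_str:  # iterating over bytes yields ints in Python 3
        rand_int <<= 8
        rand_int += int(c)
    return rand_int

def give_a_rand():
    # Reseed on every call so parallel workers do not share one generator state.
    random.seed(seed_from_urandom())
    return random.random()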