Is there a way to optimize calls to `random.random()`?

Question

Suppose I want to estimate π with a Monte Carlo simulation based on the volume of a sphere (code below).

Is there a way to optimize the calls to `random.random()`?

$ cat pi_estimate.py
#!/usr/bin/env python

"""
The task:
    Consider the part of the sphere x^2 + y^2 + z^2 <= 1 for which x,y,z > 0

The calculation:
    We generate n (e.g., 10,000) points (x,y,z) with 0 <= x,y,z <= 1 and using
    the formula for sphere volume: V = (4/3) * PI * r^3, we'll estimate PI.

    The cube in which the sphere resides has a volume: 8 * r^3

    Now, if we only consider the octant where 0 <= x,y,z <= 1, then it's
    only 1/8th of the total volumes, namely, Vs = (1/6) * PI * r^3, and
    Vc = r^3
    As r == 1, Vs = (1/6) * PI, and Vc = 1
    Vs/Vc = (1/6) * PI
    Thus PI = 6 * Vs/Vc

    So, every point (x,y,z) with 0 <= x,y,z <= 1 which has x^2 + y^2 + z^2 <= 1
    is added to Vs (and Vc), and if not, then it is only in Vc.
"""

import random

N = 1000000
Vs_counter = 0
for i in xrange(N):
    x = random.random() 
    y = random.random()
    z = random.random()
    if (x**2 + y**2 + z**2) <= 1:
        Vs_counter += 1

pi = 6 * (1.0 * Vs_counter / N)
print "PI is estimated",pi

Profiling does indeed show that a substantial share of the script's time (about 1 s of 3.9 s) is spent in random.random():

$ python -m cProfile pi_estimate.py
PI is estimated 3.142194
         3000049 function calls in 3.856 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 __future__.py:48(<module>)
        1    0.000    0.000    0.000    0.000 __future__.py:74(_Feature)
        7    0.000    0.000    0.000    0.000 __future__.py:75(__init__)
        1    0.007    0.007    0.007    0.007 hashlib.py:55(<module>)
        6    0.000    0.000    0.000    0.000 hashlib.py:94(__get_openssl_constructor)
        1    2.822    2.822    3.856    3.856 pi_estimate.py:22(<module>)
        1    0.000    0.000    0.003    0.003 random.py:100(seed)
        1    0.030    0.030    0.040    0.040 random.py:40(<module>)
        1    0.000    0.000    0.000    0.000 random.py:655(WichmannHill)
        1    0.000    0.000    0.000    0.000 random.py:72(Random)
        1    0.000    0.000    0.000    0.000 random.py:805(SystemRandom)
        1    0.000    0.000    0.003    0.003 random.py:91(__init__)
        1    0.000    0.000    0.000    0.000 {_hashlib.openssl_md5}
        1    0.000    0.000    0.000    0.000 {_hashlib.openssl_sha1}
        1    0.000    0.000    0.000    0.000 {_hashlib.openssl_sha224}
        1    0.000    0.000    0.000    0.000 {_hashlib.openssl_sha256}
        1    0.000    0.000    0.000    0.000 {_hashlib.openssl_sha384}
        1    0.000    0.000    0.000    0.000 {_hashlib.openssl_sha512}
        1    0.000    0.000    0.000    0.000 {binascii.hexlify}
        1    0.001    0.001    0.001    0.001 {function seed at 0xffe31e2c}
        6    0.000    0.000    0.000    0.000 {getattr}
        6    0.000    0.000    0.000    0.000 {globals}
        1    0.000    0.000    0.000    0.000 {math.exp}
        2    0.000    0.000    0.000    0.000 {math.log}
        1    0.000    0.000    0.000    0.000 {math.sqrt}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  3000000    0.994    0.000    0.994    0.000 {method 'random' of '_random.Random' objects}
        1    0.002    0.002    0.002    0.002 {posix.urandom}

Edit (Thu, Nov 27, 2014, 20:45:26)

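Since @ch3ka checked his code and saw an improvement when using local names for library functions, I decided to check with my own code.
In my code, I could not detect any improvement: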

$ for i in {1..9} ; do for script in  pi_estimate.py  pi_estimate_local.py  ; do echo $script; python -m cProfile $script | grep 3000000; done; done
pi_estimate.py
  3000000    0.360    0.000    0.360    0.000 {method 'random' of '_random.Random' objects}
pi_estimate_local.py
  3000000    0.341    0.000    0.341    0.000 {method 'random' of '_random.Random' objects}
pi_estimate.py
  3000000    0.326    0.000    0.326    0.000 {method 'random' of '_random.Random' objects}
pi_estimate_local.py
  3000000    0.337    0.000    0.337    0.000 {method 'random' of '_random.Random' objects}
pi_estimate.py
  3000000    0.331    0.000    0.331    0.000 {method 'random' of '_random.Random' objects}
pi_estimate_local.py
  3000000    0.317    0.000    0.317    0.000 {method 'random' of '_random.Random' objects}
pi_estimate.py
  3000000    0.327    0.000    0.327    0.000 {method 'random' of '_random.Random' objects}
pi_estimate_local.py
  3000000    0.316    0.000    0.316    0.000 {method 'random' of '_random.Random' objects}
pi_estimate.py
  3000000    0.354    0.000    0.354    0.000 {method 'random' of '_random.Random' objects}
pi_estimate_local.py
  3000000    0.325    0.000    0.325    0.000 {method 'random' of '_random.Random' objects}
pi_estimate.py
  3000000    0.326    0.000    0.326    0.000 {method 'random' of '_random.Random' objects}
pi_estimate_local.py
  3000000    0.341    0.000    0.341    0.000 {method 'random' of '_random.Random' objects}
pi_estimate.py
  3000000    0.349    0.000    0.349    0.000 {method 'random' of '_random.Random' objects}
pi_estimate_local.py
  3000000    0.324    0.000    0.324    0.000 {method 'random' of '_random.Random' objects}
pi_estimate.py
  3000000    0.326    0.000    0.326    0.000 {method 'random' of '_random.Random' objects}
pi_estimate_local.py
  3000000    0.315    0.000    0.315    0.000 {method 'random' of '_random.Random' objects}
pi_estimate.py
  3000000    0.358    0.000    0.358    0.000 {method 'random' of '_random.Random' objects}
pi_estimate_local.py
  3000000    0.324    0.000    0.324    0.000 {method 'random' of '_random.Random' objects}

Here are the two scripts (I edited out the docstring):

$ cat pi_estimate.py | tail -14

import random

N = 1000000
Vs_counter = 0
for i in xrange(N):
    x = random.random() 
    y = random.random()
    z = random.random()
    if (x**2 + y**2 + z**2) <= 1:
        Vs_counter += 1

pi = 6 * (1.0 * Vs_counter / N)
print "PI is estimated",pi

$ cat pi_estimate_local.py | tail -14

from random import random as rnd

N = 1000000
Vs_counter = 0
for i in xrange(N):
    x = rnd()
    y = rnd()
    z = rnd()
    if (x**2 + y**2 + z**2) <= 1:
        Vs_counter += 1

pi = 6 * (1.0 * Vs_counter / N)
print "PI is estimated",pi

Edit (Fri, Nov 28, 2014, 17:50:09)

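Regarding @ch3ka's last three comments: I timed the 3,000,000 calls to random(), and as @ch3ka pointed out, the Python profiler does paint a misleading picture. Using a local reference for the random call does save time, as the whole-script timings show: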

$ for i in {1..9}; do python pi_estimate.py ; done
PI is estimated 3.143076  -> execution time: 2.62900018692
PI is estimated 3.143346  -> execution time: 2.58100008965
PI is estimated 3.140286  -> execution time: 2.52200007439
PI is estimated 3.145734  -> execution time: 2.5110001564
PI is estimated 3.140898  -> execution time: 2.51300001144
PI is estimated 3.143058  -> execution time: 2.59200000763
PI is estimated 3.139332  -> execution time: 2.60400009155
PI is estimated 3.142956  -> execution time: 2.47399997711
PI is estimated 3.144552  -> execution time: 2.50100016594

$ for i in {1..9}; do python pi_estimate_local.py ; done
PI is estimated 3.146772  -> execution time: 2.22300004959
PI is estimated 3.142038  -> execution time: 2.18499994278
PI is estimated 3.139032  -> execution time: 2.14800000191
PI is estimated 3.14052  -> execution time: 2.20199990273
PI is estimated 3.141384  -> execution time: 2.25199985504
PI is estimated 3.142086  -> execution time: 2.25200009346
PI is estimated 3.137748  -> execution time: 2.18099999428
PI is estimated 3.141906  -> execution time: 2.40199995041
PI is estimated 3.138126  -> execution time: 2.16100001335
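
The timing code behind the "execution time" figures is not shown above. As a rough sketch only, assuming a wall-clock timer wrapped around the whole estimate, it could look like the following. The names `estimate_pi` and `timed` are hypothetical, and the sketch is written for Python 3 (`time.perf_counter`, `print()` as a function), unlike the Python 2 scripts in the question.

```python
import random
import time


def estimate_pi(n):
    """Estimate pi from n random points in the unit cube (hypothetical helper)."""
    inside = 0
    for _ in range(n):
        x = random.random()
        y = random.random()
        z = random.random()
        # Count the points that fall inside the unit-sphere octant.
        if x * x + y * y + z * z <= 1:
            inside += 1
    return 6.0 * inside / n


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds). Hypothetical helper."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    pi, elapsed = timed(estimate_pi, 1000000)
    print("PI is estimated", pi, " -> execution time:", elapsed)
```

Timing the whole run this way avoids the per-call instrumentation overhead that cProfile adds, which is presumably why the two measurements disagree.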

Answer 1

Actually, there is one optimization technique you can use: local aliasing.

Consider:

import random
import timeit

try: xrange # py3 compatibility
except NameError: xrange = range

def f1():
    return sum((random.random() for _ in xrange(10**5)))

def f2():
    rand = random.random # bind random.random to local var
    myrange = xrange # same for range gen (kinda pointless here, but to illustrate that you can do this with everything)
    return sum((rand() for _ in myrange(10**5)))

print(timeit.timeit(f1, number=100))
print(timeit.timeit(f2, number=100))

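f1 and f2 do the same thing, right? The only difference is that f2 has the advantage of looking up the range and rand functions in the function's own namespace, whereas f1 has to look up random in the module namespace and then look up .random on it as well.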

So in theory, if Python does not optimize this case internally, we should see an advantage. And indeed we do, even on py3:

ch3ka@x200 /tmp % python2 aliastest.py
1.88513803482
1.4380030632
ch3ka@x200 /tmp % python3 aliastest.py
2.096395079046488
1.6709147160872817

So with this technique you can speed up your program. Then again, how much you gain depends on implementation details.
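
Applied to the loop from the question, the same idea might look like the sketch below. The function names `estimate_pi_attr` and `estimate_pi_alias` are made up for illustration, the code assumes Python 3, and any speedup you measure will depend on your interpreter version and machine.

```python
import random
import timeit


def estimate_pi_attr(n):
    """Look up random.random through the module on every call."""
    inside = 0
    for _ in range(n):
        x = random.random()
        y = random.random()
        z = random.random()
        if x * x + y * y + z * z <= 1:
            inside += 1
    return 6.0 * inside / n


def estimate_pi_alias(n):
    """Bind random.random to a local name once, then call the local."""
    rnd = random.random  # local alias: one attribute lookup instead of 3 * n
    inside = 0
    for _ in range(n):
        x = rnd()
        y = rnd()
        z = rnd()
        if x * x + y * y + z * z <= 1:
            inside += 1
    return 6.0 * inside / n


if __name__ == "__main__":
    # Time a few runs of each variant; smaller is better.
    print(timeit.timeit(lambda: estimate_pi_attr(100000), number=5))
    print(timeit.timeit(lambda: estimate_pi_alias(100000), number=5))
```

Both functions draw the same random numbers in the same order, so with the same seed they return the same estimate. Only the name lookup differs.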

Note that you could also write f2 as:

def f3(rand = random.random, myrange = xrange):
    return sum((rand() for _ in myrange(10**5)))

This binds the names at function definition time. However, I think most of the speedup comes from avoiding the repeated attribute lookup.
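
One way to check the lookup claim on your own interpreter is to compare the bytecode of the two call styles with the standard `dis` module. This is only a sketch: `call_via_module`, `call_via_default` and `opnames` are hypothetical names, and the exact opcode names differ between CPython versions, so treat the printed lists as illustrative.

```python
import dis
import random


def call_via_module():
    # Needs a global lookup of `random`, then an attribute lookup of `.random`.
    return random.random()


def call_via_default(rand=random.random):
    # `rand` was bound when the function was defined; it is a plain local here.
    return rand()


def opnames(func):
    """Return the opcode names of func's bytecode (hypothetical helper)."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    print(opnames(call_via_module))
    print(opnames(call_via_default))
```

On CPython, the first list should contain a `LOAD_GLOBAL` followed by an attribute or method load, while the second should contain no global lookup at all.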
