Question

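I have some Python code with many classes. I used cProfile and found that the total time to run the program is 68 seconds. The following function, in a class called Buyers, accounts for about 60 of those 68 seconds. I have to run the program about 100 times, so any increase in speed will help. Can you suggest ways to speed it up by modifying the code? If you need more information, please let me know.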

def qtyDemanded(self, timePd, priceVector):
    '''Returns quantity demanded in period timePd. In addition,
    also updates the list of customers and non-customers.

    Inputs: timePd and priceVector
    Output: count of people for whom priceVector[-1] < utility
    '''

    ## Initialize count of customers to zero
    ## Set self.customers and self.nonCustomers to empty lists
    price = priceVector[-1]
    count = 0
    self.customers = []
    self.nonCustomers = []


    for person in self.people:
        if person.utility >= price:             
            person.customer = 1
            self.customers.append(person)
        else:
            person.customer = 0
            self.nonCustomers.append(person)

    return len(self.customers)

self.people is a list of person objects. Each person has customer and utility as its attributes.

EDIT - Responses

-------------------------------------

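Thanks so much for the suggestions. Here are my responses to some of the questions and suggestions people have kindly made. I have not tried them all yet, but I will try the others and write back later.

(1) @amber - The function is called 80,000 times.

(2) @gnibbler and others - self.people is a list of Person objects in memory. It is not connected to a database.

(3) @Hugh Bothwell

Time taken by the original function - 60.8 seconds (called 80,000 times)

Time taken by the new function with local function aliases - 56.4 seconds (called 80,000 times)

(4) @rotoglup and @Martin Thomas

I have not tried your solutions yet. I need to check the rest of the code to see where I use self.customers before I can stop appending customers to the self.customers list. But I will try it and write back.

(5) @TryPyPy - thanks for your kind offer to check the code.

Let me first read up a little on the suggestions you have made to see whether they are feasible to use.

EDIT 2: Some suggested that, since I am already flagging customers and non-customers in self.people, I should try it without building the separate lists self.customers and self.noncustomers with append. Instead, I should loop over self.people to find the number of customers. I tried the following code and timed the two functions below, f_w_append and f_wo_append. I did find that the latter takes less time, but it still takes 96% of the time of the former. That is, the increase in speed is very small.

@TryPyPy - The following piece of code is complete enough to check the bottleneck function, in case your offer to check it with other compilers still stands.

Thanks again to everyone who replied.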

import numpy

class person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = 0

class population(object):
    def __init__(self, numpeople):
        self.people = []
        self.cus = []
        self.noncus = []
        numpy.random.seed(1)
        utils = numpy.random.uniform(0, 300, numpeople)
        for u in utils:
            per = person(u)
            self.people.append(per)

popn = population(300)

def f_w_append():
    '''Function with append'''
    P = 75
    cus = []
    noncus = []
    for per in popn.people:
        if  per.utility >= P:
            per.customer = 1
            cus.append(per)
        else:
            per.customer = 0
            noncus.append(per)
    return len(cus)

def f_wo_append():
    '''Function without append'''
    P = 75
    for per in popn.people:
        if  per.utility >= P:
            per.customer = 1
        else:
            per.customer = 0

    numcustomers = 0
    for per in popn.people:
        if per.customer == 1:
            numcustomers += 1                
    return numcustomers

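EDIT 3: Seems the problem is numpy

This is in response to what John Machin said below. Below you see two ways of defining the Population class. I ran the program below twice, once with each way of creating the Population class. One uses numpy and one does not. The one without numpy takes about the same time as John found in his runs. The one with numpy takes much longer. What is not clear to me is why: the popn instance is created before the time recording begins (at least that is how it appears from the code), so why does the numpy version take longer? And I thought numpy was supposed to be more efficient. Anyhow, the problem seems to be with numpy rather than with append, even though append does slow things down a little. Can someone please confirm with the code below? Thanks.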

import random # instead of numpy
import numpy
import time
timer_func = time.time # using Mac OS X 10.5.8

class Person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = 0

class Population(object):
    def __init__(self, numpeople):
        random.seed(1)
        self.people = [Person(random.uniform(0, 300)) for i in xrange(numpeople)]
        self.cus = []
        self.noncus = []   

# Numpy based    
# class Population(object):
#     def __init__(self, numpeople):
#         numpy.random.seed(1)
#         utils = numpy.random.uniform(0, 300, numpeople)
#         self.people = [Person(u) for u in utils]
#         self.cus = []
#         self.noncus = []    


def f_wo_append(popn):
    '''Function without append'''
    P = 75
    for per in popn.people:
        if  per.utility >= P:
            per.customer = 1
        else:
            per.customer = 0

    numcustomers = 0
    for per in popn.people:
        if per.customer == 1:
            numcustomers += 1                
    return numcustomers



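# Create the population before timing starts (300 people assumed, as in EDIT 2)
popn = Population(300)

t0 = timer_func()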
for i in xrange(20000):
    x = f_wo_append(popn)
t1 = timer_func()
print t1-t0

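EDIT 4: See the answers by John Machin and TryPyPy

Since there have been so many edits and updates here, those who find themselves here for the first time may be a little confused. See the answers by John Machin and TryPyPy. Both of these can help substantially in improving the speed of the code. I am grateful to them and to the others who alerted me to the slowness of append. Since, in this instance, I am going to use John Machin's solution and not use numpy to generate the utilities, I am accepting his response as the answer. However, I really appreciate the directions pointed out by TryPyPy as well.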

Answer 1

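There are many things you can try after optimizing your Python code for speed. If this program does not need C extensions, you can run it under PyPy to benefit from its JIT compiler. You can try making a C extension for possibly huge speedups. Shed Skin will even allow you to convert your Python program to a standalone C++ binary.

I am willing to time your program under these different optimization scenarios if you can provide enough code for benchmarking.

Edit: First of all, I have to agree with everyone else: are you sure you are measuring the time correctly? The example code runs 100 times in under 0.1 seconds, so there is a good chance that either the time is wrong or you have a bottleneck (IO?) that is not present in the code sample.

That said, I made it 300k people so the times were consistent. Here is the adapted code, shared by CPython (2.5), PyPy and Shed Skin: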

from time import time
import random
import sys


class person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = 0


class population(object):
    def __init__(self, numpeople, util):
        self.people = []
        self.cus = []
        self.noncus = []
        for u in util:
            per = person(u)
            self.people.append(per)


def f_w_append(popn):
    '''Function with append'''
    P = 75
    cus = []
    noncus = []
    # Help CPython a bit
    # cus_append, noncus_append = cus.append, noncus.append
    for per in popn.people:
        if  per.utility >= P:
            per.customer = 1
            cus.append(per)
        else:
            per.customer = 0
            noncus.append(per)
    return len(cus)


def f_wo_append(popn):
    '''Function without append'''
    P = 75
    for per in popn.people:
        if  per.utility >= P:
            per.customer = 1
        else:
            per.customer = 0

    numcustomers = 0
    for per in popn.people:
        if per.customer == 1:
            numcustomers += 1
    return numcustomers


def main():
    try:
        numpeople = int(sys.argv[1])
    except:
        numpeople = 300000

    print "Running for %s people, 100 times." % numpeople

    begin = time()
    random.seed(1)
    # Help CPython a bit
    uniform = random.uniform
    util = [uniform(0.0, 300.0) for _ in xrange(numpeople)]
    # util = [random.uniform(0.0, 300.0) for _ in xrange(numpeople)]

    popn1 = population(numpeople, util)
    start = time()
    for _ in xrange(100):
        r = f_wo_append(popn1)
    print r
    print "Without append: %s" % (time() - start)


    popn2 = population(numpeople, util)
    start = time()
    for _ in xrange(100):
        r = f_w_append(popn2)
    print r
    print "With append: %s" % (time() - start)

    print "\n\nTotal time: %s" % (time() - begin)

if __name__ == "__main__":
    main()

Running with PyPy is as simple as running with CPython: you just type 'pypy' instead of 'python'. For Shed Skin, you must convert to C++, compile and run:

shedskin -e makefaster.py && make 

# Check that you're using the makefaster.so file and run test
python -c "import makefaster; print makefaster.__file__; makefaster.main()"

And here is the Cython-ized code:

from time import time
import random
import sys


cdef class person:
    cdef readonly int utility
    cdef public int customer

    def __init__(self, util):
        self.utility = util
        self.customer = 0


class population(object):
    def __init__(self, numpeople, util):
        self.people = []
        self.cus = []
        self.noncus = []
        for u in util:
            per = person(u)
            self.people.append(per)


cdef int f_w_append(popn):
    '''Function with append'''
    cdef int P = 75
    cdef person per
    cus = []
    noncus = []
    # Help CPython a bit
    # cus_append, noncus_append = cus.append, noncus.append

    for per in popn.people:
        if  per.utility >= P:
            per.customer = 1
            cus.append(per)
        else:
            per.customer = 0
            noncus.append(per)
    cdef int lcus = len(cus)
    return lcus


cdef int f_wo_append(popn):
    '''Function without append'''
    cdef int P = 75
    cdef person per
    for per in popn.people:
        if  per.utility >= P:
            per.customer = 1
        else:
            per.customer = 0

    cdef int numcustomers = 0
    for per in popn.people:
        if per.customer == 1:
            numcustomers += 1
    return numcustomers


def main():

    cdef int i, r, numpeople
    cdef double _0, _300
    _0 = 0.0
    _300 = 300.0

    try:
        numpeople = int(sys.argv[1])
    except:
        numpeople = 300000

    print "Running for %s people, 100 times." % numpeople

    begin = time()
    random.seed(1)
    # Help CPython a bit
    uniform = random.uniform
    util = [uniform(_0, _300) for i in xrange(numpeople)]
    # util = [random.uniform(0.0, 300.0) for _ in xrange(numpeople)]

    popn1 = population(numpeople, util)
    start = time()
    for i in xrange(100):
        r = f_wo_append(popn1)
    print r
    print "Without append: %s" % (time() - start)


    popn2 = population(numpeople, util)
    start = time()
    for i in xrange(100):
        r = f_w_append(popn2)
    print r
    print "With append: %s" % (time() - start)

    print "\n\nTotal time: %s" % (time() - begin)

if __name__ == "__main__":
    main()

To build it, it is handy to have a setup.py like this one:

from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

ext_modules = [Extension("cymakefaster", ["makefaster.pyx"])]

setup(
  name = 'Python code to speed up',
  cmdclass = {'build_ext': build_ext},
  ext_modules = ext_modules
)

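You build it with: python setupfaster.py build_ext --inplace

Then test: python -c "import cymakefaster; print cymakefaster.__file__; cymakefaster.main()"

Timings were run five times for each version, with Cython being the fastest and easiest of the code generators to use (Shed Skin aims to be simpler, but cryptic error messages and implicit static typing made it harder here). As for best value, PyPy gives an impressive speedup in the counter version with no code changes.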

#Results (time in seconds for 30000 people, 100 calls for each function):
                  Mean      Min  Times    
CPython 2.5.2
Without append: 35.037   34.518  35.124, 36.363, 34.518, 34.620, 34.559
With append:    29.251   29.126  29.339, 29.257, 29.259, 29.126, 29.272
Total time:     69.288   68.739  69.519, 70.614, 68.746, 68.739, 68.823

PyPy 1.4.1
Without append:  2.672    2.655   2.655,  2.670,  2.676,  2.690,  2.668
With append:    13.030   12.672  12.680, 12.725, 14.319, 12.755, 12.672
Total time:     16.551   16.194  16.196, 16.229, 17.840, 16.295, 16.194

Shed Skin 0.7 (gcc -O2)
Without append:  1.601    1.599   1.599,  1.605,  1.600,  1.602,  1.599
With append:     3.811    3.786   3.839,  3.795,  3.798,  3.786,  3.839
Total time:      5.704    5.677   5.715,  5.705,  5.699,  5.677,  5.726

Cython 0.14 (gcc -O2)
Without append:  1.692    1.673   1.673,  1.710,  1.678,  1.688,  1.711
With append:     3.087    3.067   3.079,  3.080,  3.119,  3.090,  3.067
Total time:      5.565    5.561   5.562,  5.561,  5.567,  5.562,  5.572

Edit: Aaaand more meaningful timings, for 80000 calls with 300 people each:

Results (time in seconds for 300 people, 80000 calls for each function):
                  Mean      Min  Times
CPython 2.5.2
Without append: 27.790   25.827  25.827, 27.315, 27.985, 28.211, 29.612
With append:    26.449   24.721  24.721, 27.017, 27.653, 25.576, 27.277
Total time:     54.243   50.550  50.550, 54.334, 55.652, 53.789, 56.892


Cython 0.14 (gcc -O2)
Without append:  1.819    1.760   1.760,  1.794,  1.843,  1.827,  1.871
With append:     2.089    2.063   2.100,  2.063,  2.098,  2.104,  2.078
Total time:      3.910    3.859   3.865,  3.859,  3.944,  3.934,  3.951

PyPy 1.4.1
Without append:  0.889    0.887   0.894,  0.888,  0.890,  0.888,  0.887
With append:     1.671    1.665   1.665,  1.666,  1.671,  1.673,  1.681
Total time:      2.561    2.555   2.560,  2.555,  2.561,  2.561,  2.569

Shed Skin 0.7 (g++ -O2)
Without append:  0.310    0.301   0.301,  0.308,  0.317,  0.320,  0.303
With append:     1.712    1.690   1.733,  1.700,  1.735,  1.690,  1.702
Total time:      2.027    2.008   2.035,  2.008,  2.052,  2.011,  2.029

Shed Skin becomes the fastest, and PyPy overtakes Cython. All three speed things up a lot compared to CPython.

Answer 2

Please consider trimming down your f_wo_append function:

def f_wo_append():
    '''Function without append'''
    P = 75
    numcustomers = 0
    for person in popn.people:
        person.customer = iscust = person.utility >= P
        numcustomers += iscust
    return numcustomers

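Edit in response to the OP's comment """This made it a lot worse! The trimmed version takes 4 times more time than the version I posted above."""

There is no way that it could take "4 times more" (5 times?) ... here is my code, which demonstrates a significant reduction in the "without append" case, as I suggested, and also introduces a significant improvement in the "with append" case.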

import random # instead of numpy
import time
timer_func = time.clock # better on Windows, use time.time on *x platform

class Person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = 0

class Population(object):
    def __init__(self, numpeople):
        random.seed(1)
        self.people = [Person(random.uniform(0, 300)) for i in xrange(numpeople)]
        self.cus = []
        self.noncus = []        

def f_w_append(popn):
    '''Function with append'''
    P = 75
    cus = []
    noncus = []
    for per in popn.people:
        if  per.utility >= P:
            per.customer = 1
            cus.append(per)
        else:
            per.customer = 0
            noncus.append(per)
    popn.cus = cus # omitted from OP's code
    popn.noncus = noncus # omitted from OP's code
    return len(cus)

def f_w_append2(popn):
    '''Function with append'''
    P = 75
    popn.cus = []
    popn.noncus = []
    cusapp = popn.cus.append
    noncusapp = popn.noncus.append
    for per in popn.people:
        if  per.utility >= P:
            per.customer = 1
            cusapp(per)
        else:
            per.customer = 0
            noncusapp(per)
    return len(popn.cus)    

def f_wo_append(popn):
    '''Function without append'''
    P = 75
    for per in popn.people:
        if  per.utility >= P:
            per.customer = 1
        else:
            per.customer = 0

    numcustomers = 0
    for per in popn.people:
        if per.customer == 1:
            numcustomers += 1                
    return numcustomers

def f_wo_append2(popn):
    '''Function without append'''
    P = 75
    numcustomers = 0
    for person in popn.people:
        person.customer = iscust = person.utility >= P
        numcustomers += iscust
    return numcustomers    

if __name__ == "__main__":
    import sys
    popsize, which, niter = map(int, sys.argv[1:4])
    pop = Population(popsize)
    func = (f_w_append, f_w_append2, f_wo_append, f_wo_append2)[which]
    t0 = timer_func()
    for _unused in xrange(niter):
        nc = func(pop)
    t1 = timer_func()
    print "popsize=%d func=%s niter=%d nc=%d seconds=%.2f" % (
        popsize, func.__name__, niter, nc, t1 - t0)

Here are the results of running it (Python 2.7.1, Windows 7 Pro, "Intel Core i3 CPU 540 @ 3.07 GHz"):

C:\junk>\python27\python ncust.py 300 0 80000
popsize=300 func=f_w_append niter=80000 nc=218 seconds=5.48

C:\junk>\python27\python ncust.py 300 1 80000
popsize=300 func=f_w_append2 niter=80000 nc=218 seconds=4.62

C:\junk>\python27\python ncust.py 300 2 80000
popsize=300 func=f_wo_append niter=80000 nc=218 seconds=5.55

C:\junk>\python27\python ncust.py 300 3 80000
popsize=300 func=f_wo_append2 niter=80000 nc=218 seconds=4.29

Edit 3: Why numpy takes longer:

>>> import numpy
>>> utils = numpy.random.uniform(0, 300, 10)
>>> print repr(utils[0])
42.777972538362874
>>> type(utils[0])
<type 'numpy.float64'>

And here is why my f_wo_append2 function took 4 times longer:

>>> x = utils[0]
>>> type(x)
<type 'numpy.float64'>
>>> type(x >= 75) 
<type 'numpy.bool_'> # iscust refers to a numpy.bool_
>>> type(0 + (x >= 75)) 
<type 'numpy.int32'> # numcustomers ends up referring to a numpy.int32
>>>

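The empirical evidence is that these custom types are not so fast when used as scalars ... perhaps because they need to reset the floating-point hardware each time they are used. They are fine for big arrays, but not for scalars.

Are you using any other numpy functionality? If not, just use the random module. If you have other uses for numpy, you may wish to coerce each numpy.float64 to a float during the population setup.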
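As a rough sketch of that last suggestion (written for Python 3, unlike the Python 2 listings above), the population setup can call float() once per utility so that the hot loop only ever compares built-in floats. The make_people helper is a hypothetical name used for illustration; the commented-out line shows where a numpy array would be passed in.

```python
class Person(object):
    def __init__(self, util):
        # float() converts a numpy.float64 (or any real number) to a built-in float
        self.utility = float(util)
        self.customer = 0


def make_people(utils):
    """Hypothetical helper: build Person objects from any iterable of numbers."""
    return [Person(u) for u in utils]


# With numpy this might be called as:
#     people = make_people(numpy.random.uniform(0, 300, numpeople))
people = make_people([42, 75.0, 299.5])
print([type(p.utility).__name__ for p in people])
```

The conversion cost is paid once at setup rather than on each of the 80,000 calls.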

Answer 3

You can eliminate some lookups by using local function aliases:

def qtyDemanded(self, timePd, priceVector):
    '''Returns quantity demanded in period timePd. In addition,
    also updates the list of customers and non-customers.

    Inputs: timePd and priceVector
    Output: count of people for whom priceVector[-1] < utility
    '''
    price = priceVector[-1]
    self.customers = []
    self.nonCustomers = []

    # local function aliases
    addCust = self.customers.append
    addNonCust = self.nonCustomers.append

    for person in self.people:
        if person.utility >= price:             
            person.customer = 1
            addCust(person)
        else:
            person.customer = 0
            addNonCust(person)

    return len(self.customers)
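To see how much the aliases buy on a particular interpreter, a small timeit comparison along these lines may help. It is only a sketch in Python 3 with stand-in data (300 people with utilities 0..299, not the OP's population), and the function names plain and aliased are made up for the comparison.

```python
import timeit


class Person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = 0


people = [Person(u) for u in range(300)]  # stand-in data, not the OP's


def plain(price=75):
    customers, non_customers = [], []
    for person in people:
        if person.utility >= price:
            person.customer = 1
            customers.append(person)       # attribute lookup on every pass
        else:
            person.customer = 0
            non_customers.append(person)
    return len(customers)


def aliased(price=75):
    customers, non_customers = [], []
    add_cust = customers.append            # bound methods looked up once
    add_non_cust = non_customers.append
    for person in people:
        if person.utility >= price:
            person.customer = 1
            add_cust(person)
        else:
            person.customer = 0
            add_non_cust(person)
    return len(customers)


for func in (plain, aliased):
    seconds = timeit.timeit(func, number=2000)
    print("%s: %.3f s" % (func.__name__, seconds))
```

The absolute numbers depend on the machine and Python version; only the ratio between the two lines is of interest.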

Answer 4

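Depending on how often you add new elements to self.people or change person.utility, you could consider keeping self.people sorted by the utility field.

Then you could use a bisect function to find the lowest index i_pivot at which the condition person[i_pivot].utility >= price is met. This would have a lower complexity (O(log N)) than your exhaustive loop (O(N)).

With this information, you could then update your people list if needed:

Do you really need to update the customer field each time? In the sorted case, you could easily deduce this value while iterating: for example, with the list sorted in increasing order, customer = (index >= i_pivot)

The same question applies to the customers and nonCustomers lists. Why do you need them? They could be replaced by slices of the sorted list: for example, with the list sorted in increasing order, customers = self.people[i_pivot:]

All this would let you reduce the complexity of your algorithm and use more built-in (fast) Python functions, which could speed up your implementation.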
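Here is a minimal sketch of the sorted-list idea in Python 3, assuming utilities change rarely enough that sorting once is worthwhile. The SortedBuyers class and its method names are hypothetical; a parallel list of utilities is kept so that bisect.bisect_left can search plain numbers.

```python
import bisect


class Person(object):
    def __init__(self, util):
        self.utility = util


class SortedBuyers(object):
    """Hypothetical variant of Buyers that keeps people sorted by utility."""

    def __init__(self, people):
        self.people = sorted(people, key=lambda p: p.utility)  # ascending
        # Parallel list of sort keys, so bisect compares plain numbers.
        self.utilities = [p.utility for p in self.people]

    def pivot(self, price):
        # First index whose utility is >= price, found in O(log N).
        return bisect.bisect_left(self.utilities, price)

    def qty_demanded(self, price):
        return len(self.people) - self.pivot(price)

    def customers(self, price):
        return self.people[self.pivot(price):]

    def non_customers(self, price):
        return self.people[:self.pivot(price)]


buyers = SortedBuyers([Person(u) for u in (200, 10, 75, 50, 100)])
print(buyers.qty_demanded(75))
```

If people are added or utilities change, both lists have to be updated together, which is the trade-off mentioned at the start of this answer.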

Answer 5

This comment rings alarm bells:

'''Returns quantity demanded in period timePd. In addition,
also updates the list of customers and non-customers.

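Aside from the fact that timePd is not used in the function, if you really want to just return the quantity, do just that in the function. Do the "in addition" work in a separate function.

Then profile again and see which of these two functions you are spending most of your time in.

I like to apply SRP (the single responsibility principle) to methods as well as classes: it makes them easier to test.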
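One way the split could look, as a sketch in Python 3. The method name updateCustomerLists and the simplified signature (a single price instead of timePd and priceVector) are assumptions made for illustration, not the OP's actual interface.

```python
class Person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = 0


class Buyers(object):
    def __init__(self, people):
        self.people = people
        self.customers = []
        self.nonCustomers = []

    def qtyDemanded(self, price):
        """Only counts; no side effects on people or lists."""
        return sum(1 for person in self.people if person.utility >= price)

    def updateCustomerLists(self, price):
        """Only the bookkeeping: set the flags and rebuild both lists."""
        self.customers = []
        self.nonCustomers = []
        for person in self.people:
            if person.utility >= price:
                person.customer = 1
                self.customers.append(person)
            else:
                person.customer = 0
                self.nonCustomers.append(person)


buyers = Buyers([Person(u) for u in (10, 80, 300)])
print(buyers.qtyDemanded(75))
buyers.updateCustomerLists(75)
```

Profiling the two methods separately then shows whether the counting or the bookkeeping dominates.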

Answer 6

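Some curious things I noted:

timePd is passed as a parameter but never used

price is an array but you only ever use the last entry - why not pass the value there instead of passing the list?

count is initialized and never used

self.people contains multiple person objects which are then copied to either self.customers or self.noncustomers as well as having their customer flag set. Why not skip the copy operation and, on return, just iterate over the list, looking at the customer flag? This would save the expensive append.

Alternatively, try using psyco, which can speed up pure Python, sometimes considerably.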

Answer 7

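It is surprising that the function shown is such a bottleneck, because it is relatively simple. For that reason, I would double-check my profiling procedure and results. However, if they are correct, the most time-consuming part of your function has to be the for loop it contains, so it makes sense to focus on speeding that up. One way to do this is to replace the if/else with straight-line code. You can also slightly reduce the attribute lookups for the append list method. Here is how both of those things could be accomplished: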

def qtyDemanded(self, timePd, priceVector):
    '''Returns quantity demanded in period timePd. In addition,
    also updates the list of customers and non-customers.

    Inputs: timePd and priceVector
    Output: count of people for whom priceVector[-1] < utility
    '''

    price = priceVector[-1] # last price
    kinds = [[], []] # initialize sublists of noncustomers and customers
    kindsAppend = [kinds[b].append for b in (False, True)] # append methods

    for person in self.people:
        person.customer = person.utility >= price  # customer test
        kindsAppend[person.customer](person)  # add to proper list

    self.nonCustomers = kinds[False]
    self.customers = kinds[True]

    return len(self.customers)

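That said, I must add that it seems a little redundant to have a customer flag in each person object and also to put each of them into a separate list depending on that attribute. Not creating these two lists would of course speed up the loop further.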
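If the two lists are needed only occasionally, one option is to derive them from the flags on demand. The following is a sketch in Python 3 under that assumption; the property-based layout and the simplified qtyDemanded(price) signature are illustrative choices, not the OP's code.

```python
class Person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = False


class Buyers(object):
    def __init__(self, people):
        self.people = people

    def qtyDemanded(self, price):
        count = 0
        for person in self.people:
            person.customer = person.utility >= price
            count += person.customer  # True counts as 1, False as 0
        return count

    @property
    def customers(self):
        # Built only when somebody actually asks for the list.
        return [p for p in self.people if p.customer]

    @property
    def nonCustomers(self):
        return [p for p in self.people if not p.customer]


buyers = Buyers([Person(u) for u in (10, 80, 300)])
print(buyers.qtyDemanded(75))
print(len(buyers.customers), len(buyers.nonCustomers))
```

This moves the cost of list building out of the hot loop, at the price of rebuilding a list on every access to customers or nonCustomers.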

Answer 8

You are asking for guesses, and mostly you are getting guesses.

There is no need to guess. Here's an example.

Improving the speed of Python code
