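I have some Python code with many classes. Using cProfile, I found that the total time to run the program is 68 seconds, and that the following function in a class called Buyers accounts for about 60 of those 68 seconds. I have to run the program about 100 times, so any speedup will help. Can you suggest ways to speed it up by modifying the code? If you need more information, please let me know.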
def qtyDemanded(self, timePd, priceVector):
    '''Returns quantity demanded in period timePd. In addition,
    also updates the list of customers and non-customers.
    Inputs: timePd and priceVector
    Output: count of people for whom priceVector[-1] < utility
    '''
    ## Initialize count of customers to zero
    ## Set self.customers and self.nonCustomers to empty lists
    price = priceVector[-1]
    count = 0
    self.customers = []
    self.nonCustomers = []
    for person in self.people:
        if person.utility >= price:
            person.customer = 1
            self.customers.append(person)
        else:
            person.customer = 0
            self.nonCustomers.append(person)
    return len(self.customers)
self.people is a list of person objects. Each person has customer and utility as its attributes.
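EDIT - responses
-------------------------------------
Thanks so much for the suggestions. Here are responses to some of the questions and suggestions people have kindly made. I have not tried them all yet, but I will try the others and write back later.
(1) @amber - the function is accessed 80,000 times.
(2) @gnibbler and others - self.people is a list of Person objects in memory. It is not connected to a database.
(3) @Hugh Bothwell
Time taken by the original function: 60.8 seconds (accessed 80,000 times)
Time taken by the new function with local function aliases: 56.4 seconds (accessed 80,000 times)
(4) @rotoglup and @Martin Thomas
I have not tried your solutions yet. I need to check the rest of the code to see where I use self.customers before I can stop appending customers to the self.customers list. But I will try this and write back.
(5) @TryPyPy - thanks for your kind offer to check the code.
Let me first read up on the suggestions you have made, to see whether they are feasible to use.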
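EDIT 2
Some people suggested that, since I am already flagging customers and non-customers in self.people, I should try not creating the separate self.customers and self.noncustomers lists with append. Instead, I should loop over self.people to count the customers. I tried the following code and timed the two functions below, f_w_append and f_wo_append. I did find that the latter takes less time, but it still takes 96% of the time taken by the former. That is, the speedup is very small.
@TryPyPy - the following piece of code is complete enough to check the bottleneck function, in case your offer to check it with other compilers still stands.
Thanks again to everyone who replied.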
import numpy

class person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = 0

class population(object):
    def __init__(self, numpeople):
        self.people = []
        self.cus = []
        self.noncus = []
        numpy.random.seed(1)
        utils = numpy.random.uniform(0, 300, numpeople)
        for u in utils:
            per = person(u)
            self.people.append(per)

popn = population(300)

def f_w_append():
    '''Function with append'''
    P = 75
    cus = []
    noncus = []
    for per in popn.people:
        if per.utility >= P:
            per.customer = 1
            cus.append(per)
        else:
            per.customer = 0
            noncus.append(per)
    return len(cus)

def f_wo_append():
    '''Function without append'''
    P = 75
    for per in popn.people:
        if per.utility >= P:
            per.customer = 1
        else:
            per.customer = 0
    numcustomers = 0
    for per in popn.people:
        if per.customer == 1:
            numcustomers += 1
    return numcustomers
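EDIT 3: The problem seems to be numpy
This is in response to what John Machin said below. Below you see two ways of defining the Population class. I ran the program below twice, once with each way of creating the Population class: one uses numpy and one does not. The one without numpy takes about the same time as John found in his runs. The one with numpy takes much longer. What is not clear to me is that the popn instance is created before time recording begins (at least that is how it appears from the code), so why does the numpy version take longer? I also thought numpy was supposed to be more efficient. In any case, the problem seems to be with numpy rather than with append, even though append does slow things down a little. Can someone please confirm this with the code below? Thanks.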
import random  # instead of numpy
import numpy
import time
timer_func = time.time  # using Mac OS X 10.5.8

class Person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = 0

class Population(object):
    def __init__(self, numpeople):
        random.seed(1)
        self.people = [Person(random.uniform(0, 300)) for i in xrange(numpeople)]
        self.cus = []
        self.noncus = []

# Numpy based
# class Population(object):
#     def __init__(self, numpeople):
#         numpy.random.seed(1)
#         utils = numpy.random.uniform(0, 300, numpeople)
#         self.people = [Person(u) for u in utils]
#         self.cus = []
#         self.noncus = []

def f_wo_append(popn):
    '''Function without append'''
    P = 75
    for per in popn.people:
        if per.utility >= P:
            per.customer = 1
        else:
            per.customer = 0
    numcustomers = 0
    for per in popn.people:
        if per.customer == 1:
            numcustomers += 1
    return numcustomers

# Not shown in the original listing; a size of 300 is assumed from the earlier code.
popn = Population(300)

t0 = timer_func()
for i in xrange(20000):
    x = f_wo_append(popn)
t1 = timer_func()
print t1-t0
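EDIT 4: See the answers by John Machin and TryPyPy
Since there have been so many edits and updates here, those who arrive for the first time may be a little confused. See the answers by John Machin and TryPyPy. Both of them can help improve the speed of the code substantially. I am grateful to them and to the others who alerted me to the slowness of append. Since, in this instance, I am going to use John Machin's solution and not use numpy for generating utilities, I am accepting his response as the answer. However, I really appreciate the directions pointed out by TryPyPy as well.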
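Answer 0 (score: 5)
There are many things you can try after optimizing your Python code for speed. If this program doesn't need C extensions, you can run it under PyPy to benefit from its JIT compiler. You can try making a C extension (for example with Cython, as shown further down) for possibly huge speedups. Shed Skin will even let you convert your Python program to a standalone C++ binary.
I'm willing to time your program under these different optimization scenarios if you can provide enough code for benchmarking.
Edit: First of all, I have to agree with everyone else: are you sure you're measuring the time correctly? The example code runs 100 times in under 0.1 seconds, so there is a good chance that either the timing is wrong or you have a bottleneck (IO?) that isn't present in the code sample.
That said, I made it 300k people so that the timings were consistent. Here is the adapted code, shared by CPython (2.5), PyPy and Shed Skin: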
from time import time
import random
import sys

class person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = 0

class population(object):
    def __init__(self, numpeople, util):
        self.people = []
        self.cus = []
        self.noncus = []
        for u in util:
            per = person(u)
            self.people.append(per)

def f_w_append(popn):
    '''Function with append'''
    P = 75
    cus = []
    noncus = []
    # Help CPython a bit
    # cus_append, noncus_append = cus.append, noncus.append
    for per in popn.people:
        if per.utility >= P:
            per.customer = 1
            cus.append(per)
        else:
            per.customer = 0
            noncus.append(per)
    return len(cus)

def f_wo_append(popn):
    '''Function without append'''
    P = 75
    for per in popn.people:
        if per.utility >= P:
            per.customer = 1
        else:
            per.customer = 0
    numcustomers = 0
    for per in popn.people:
        if per.customer == 1:
            numcustomers += 1
    return numcustomers

def main():
    try:
        numpeople = int(sys.argv[1])
    except:
        numpeople = 300000
    print "Running for %s people, 100 times." % numpeople
    begin = time()
    random.seed(1)
    # Help CPython a bit
    uniform = random.uniform
    util = [uniform(0.0, 300.0) for _ in xrange(numpeople)]
    # util = [random.uniform(0.0, 300.0) for _ in xrange(numpeople)]
    popn1 = population(numpeople, util)
    start = time()
    for _ in xrange(100):
        r = f_wo_append(popn1)
    print r
    print "Without append: %s" % (time() - start)
    popn2 = population(numpeople, util)
    start = time()
    for _ in xrange(100):
        r = f_w_append(popn2)
    print r
    print "With append: %s" % (time() - start)
    print "\n\nTotal time: %s" % (time() - begin)

if __name__ == "__main__":
    main()
Running with PyPy is as simple as running with CPython: just type 'pypy' instead of 'python'. For Shed Skin, you must convert to C++, compile and run:
shedskin -e makefaster.py && make
# Check that you're using the makefaster.so file and run test
python -c "import makefaster; print makefaster.__file__; makefaster.main()"
And here is the Cython-ified code:
from time import time
import random
import sys

cdef class person:
    cdef readonly int utility
    cdef public int customer

    def __init__(self, util):
        self.utility = util
        self.customer = 0

class population(object):
    def __init__(self, numpeople, util):
        self.people = []
        self.cus = []
        self.noncus = []
        for u in util:
            per = person(u)
            self.people.append(per)

cdef int f_w_append(popn):
    '''Function with append'''
    cdef int P = 75
    cdef person per
    cus = []
    noncus = []
    # Help CPython a bit
    # cus_append, noncus_append = cus.append, noncus.append
    for per in popn.people:
        if per.utility >= P:
            per.customer = 1
            cus.append(per)
        else:
            per.customer = 0
            noncus.append(per)
    cdef int lcus = len(cus)
    return lcus

cdef int f_wo_append(popn):
    '''Function without append'''
    cdef int P = 75
    cdef person per
    for per in popn.people:
        if per.utility >= P:
            per.customer = 1
        else:
            per.customer = 0
    cdef int numcustomers = 0
    for per in popn.people:
        if per.customer == 1:
            numcustomers += 1
    return numcustomers

def main():
    cdef int i, r, numpeople
    cdef double _0, _300
    _0 = 0.0
    _300 = 300.0
    try:
        numpeople = int(sys.argv[1])
    except:
        numpeople = 300000
    print "Running for %s people, 100 times." % numpeople
    begin = time()
    random.seed(1)
    # Help CPython a bit
    uniform = random.uniform
    util = [uniform(_0, _300) for i in xrange(numpeople)]
    # util = [random.uniform(0.0, 300.0) for _ in xrange(numpeople)]
    popn1 = population(numpeople, util)
    start = time()
    for i in xrange(100):
        r = f_wo_append(popn1)
    print r
    print "Without append: %s" % (time() - start)
    popn2 = population(numpeople, util)
    start = time()
    for i in xrange(100):
        r = f_w_append(popn2)
    print r
    print "With append: %s" % (time() - start)
    print "\n\nTotal time: %s" % (time() - begin)

if __name__ == "__main__":
    main()
For building it, it is handy to have a setup.py like this:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

ext_modules = [Extension("cymakefaster", ["makefaster.pyx"])]

setup(
    name = 'Python code to speed up',
    cmdclass = {'build_ext': build_ext},
    ext_modules = ext_modules
)
You build it with: python setupfaster.py build_ext --inplace
Then test: python -c "import cymakefaster; print cymakefaster.__file__; cymakefaster.main()"
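Timings were run five times for each version. Cython was the fastest and easiest to use of the code generators (Shed Skin aims to be simpler, but cryptic error messages and implicit static typing made it harder here). As for best value, PyPy gives an impressive speedup in the counter version with no code changes.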
#Results (time in seconds for 30000 people, 100 calls for each function):
Mean Min Times
CPython 2.5.2
Without append: 35.037 34.518 35.124, 36.363, 34.518, 34.620, 34.559
With append: 29.251 29.126 29.339, 29.257, 29.259, 29.126, 29.272
Total time: 69.288 68.739 69.519, 70.614, 68.746, 68.739, 68.823
PyPy 1.4.1
Without append: 2.672 2.655 2.655, 2.670, 2.676, 2.690, 2.668
With append: 13.030 12.672 12.680, 12.725, 14.319, 12.755, 12.672
Total time: 16.551 16.194 16.196, 16.229, 17.840, 16.295, 16.194
Shed Skin 0.7 (gcc -O2)
Without append: 1.601 1.599 1.599, 1.605, 1.600, 1.602, 1.599
With append: 3.811 3.786 3.839, 3.795, 3.798, 3.786, 3.839
Total time: 5.704 5.677 5.715, 5.705, 5.699, 5.677, 5.726
Cython 0.14 (gcc -O2)
Without append: 1.692 1.673 1.673, 1.710, 1.678, 1.688, 1.711
With append: 3.087 3.067 3.079, 3.080, 3.119, 3.090, 3.067
Total time: 5.565 5.561 5.562, 5.561, 5.567, 5.562, 5.572
Edit: Aaand more meaningful timings, for 80000 calls with 300 people each:
Results (time in seconds for 300 people, 80000 calls for each function):
Mean Min Times
CPython 2.5.2
Without append: 27.790 25.827 25.827, 27.315, 27.985, 28.211, 29.612
With append: 26.449 24.721 24.721, 27.017, 27.653, 25.576, 27.277
Total time: 54.243 50.550 50.550, 54.334, 55.652, 53.789, 56.892
Cython 0.14 (gcc -O2)
Without append: 1.819 1.760 1.760, 1.794, 1.843, 1.827, 1.871
With append: 2.089 2.063 2.100, 2.063, 2.098, 2.104, 2.078
Total time: 3.910 3.859 3.865, 3.859, 3.944, 3.934, 3.951
PyPy 1.4.1
Without append: 0.889 0.887 0.894, 0.888, 0.890, 0.888, 0.887
With append: 1.671 1.665 1.665, 1.666, 1.671, 1.673, 1.681
Total time: 2.561 2.555 2.560, 2.555, 2.561, 2.561, 2.569
Shed Skin 0.7 (g++ -O2)
Without append: 0.310 0.301 0.301, 0.308, 0.317, 0.320, 0.303
With append: 1.712 1.690 1.733, 1.700, 1.735, 1.690, 1.702
Total time: 2.027 2.008 2.035, 2.008, 2.052, 2.011, 2.029
Shed Skin becomes the fastest, and PyPy overtakes Cython. All three speed things up a lot compared with CPython.
Answer 1 (score: 4)
Please consider trimming down your f_wo_append function:
def f_wo_append():
    '''Function without append'''
    P = 75
    numcustomers = 0
    for person in popn.people:
        person.customer = iscust = person.utility >= P
        numcustomers += iscust
    return numcustomers
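Edit in response to the OP's comment """This made it a lot worse! The trimmed version takes 4 times more time than the version I have posted above."""
There is no way that could take "4 times more" (5 times?) ... here is my code, which demonstrates a significant reduction in the "without append" case, as I suggested, and also introduces a significant improvement in the "with append" case.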
import random  # instead of numpy
import time
timer_func = time.clock  # better on Windows, use time.time on *x platform

class Person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = 0

class Population(object):
    def __init__(self, numpeople):
        random.seed(1)
        self.people = [Person(random.uniform(0, 300)) for i in xrange(numpeople)]
        self.cus = []
        self.noncus = []

def f_w_append(popn):
    '''Function with append'''
    P = 75
    cus = []
    noncus = []
    for per in popn.people:
        if per.utility >= P:
            per.customer = 1
            cus.append(per)
        else:
            per.customer = 0
            noncus.append(per)
    popn.cus = cus  # omitted from OP's code
    popn.noncus = noncus  # omitted from OP's code
    return len(cus)

def f_w_append2(popn):
    '''Function with append'''
    P = 75
    popn.cus = []
    popn.noncus = []
    cusapp = popn.cus.append
    noncusapp = popn.noncus.append
    for per in popn.people:
        if per.utility >= P:
            per.customer = 1
            cusapp(per)
        else:
            per.customer = 0
            noncusapp(per)
    return len(popn.cus)

def f_wo_append(popn):
    '''Function without append'''
    P = 75
    for per in popn.people:
        if per.utility >= P:
            per.customer = 1
        else:
            per.customer = 0
    numcustomers = 0
    for per in popn.people:
        if per.customer == 1:
            numcustomers += 1
    return numcustomers

def f_wo_append2(popn):
    '''Function without append'''
    P = 75
    numcustomers = 0
    for person in popn.people:
        person.customer = iscust = person.utility >= P
        numcustomers += iscust
    return numcustomers

if __name__ == "__main__":
    import sys
    popsize, which, niter = map(int, sys.argv[1:4])
    pop = Population(popsize)
    func = (f_w_append, f_w_append2, f_wo_append, f_wo_append2)[which]
    t0 = timer_func()
    for _unused in xrange(niter):
        nc = func(pop)
    t1 = timer_func()
    print "popsize=%d func=%s niter=%d nc=%d seconds=%.2f" % (
        popsize, func.__name__, niter, nc, t1 - t0)
Here are the results of running it (Python 2.7.1, Windows 7 Pro, "Intel Core i3 CPU 540 @ 3.07 GHz"):
C:\junk>\python27\python ncust.py 300 0 80000
popsize=300 func=f_w_append niter=80000 nc=218 seconds=5.48
C:\junk>\python27\python ncust.py 300 1 80000
popsize=300 func=f_w_append2 niter=80000 nc=218 seconds=4.62
C:\junk>\python27\python ncust.py 300 2 80000
popsize=300 func=f_wo_append niter=80000 nc=218 seconds=5.55
C:\junk>\python27\python ncust.py 300 3 80000
popsize=300 func=f_wo_append2 niter=80000 nc=218 seconds=4.29
Edit 3: Why numpy takes longer:
>>> import numpy
>>> utils = numpy.random.uniform(0, 300, 10)
>>> print repr(utils[0])
42.777972538362874
>>> type(utils[0])
<type 'numpy.float64'>
And this is why my f_wo_append2 function took 4 times longer:
>>> x = utils[0]
>>> type(x)
<type 'numpy.float64'>
>>> type(x >= 75)
<type 'numpy.bool_'> # iscust refers to a numpy.bool_
>>> type(0 + (x >= 75))
<type 'numpy.int32'> # numcustomers ends up referring to a numpy.int32
>>>
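Empirical evidence is that these custom types aren't so fast when used as scalars ... perhaps because they need to reset the floating-point hardware each time they are used. They are fine for big arrays, but not for scalars.
Are you using any other numpy functionality? If not, just use the random module. If you have other uses for numpy, you may wish to coerce the numpy.float64 values to float during the population setup.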
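As a rough sketch of that coercion, assuming numpy is still wanted for generating the utilities: ndarray.tolist() converts each element to a built-in float, so later comparisons yield plain bool values. The Person class mirrors the one above; the population size of 300 and the two type-check variables are illustrative only, and this sketch is written for Python 3.

```python
import numpy


class Person(object):
    def __init__(self, util):
        self.utility = util
        self.customer = 0


numpy.random.seed(1)
utils = numpy.random.uniform(0, 300, 300)

# tolist() turns every numpy.float64 element into a built-in float,
# so the hot loop never touches numpy scalar types.
people = [Person(u) for u in utils.tolist()]

utility_type = type(people[0].utility)
comparison_type = type(people[0].utility >= 75)
print(utility_type, comparison_type)
```

Calling float(u) on each element inside the list comprehension would have the same effect; tolist() simply does the conversion in one step.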
Answer 2 (score: 1)
You can eliminate some lookups by using local function aliases:
def qtyDemanded(self, timePd, priceVector):
    '''Returns quantity demanded in period timePd. In addition,
    also updates the list of customers and non-customers.
    Inputs: timePd and priceVector
    Output: count of people for whom priceVector[-1] < utility
    '''
    price = priceVector[-1]
    self.customers = []
    self.nonCustomers = []
    # local function aliases
    addCust = self.customers.append
    addNonCust = self.nonCustomers.append
    for person in self.people:
        if person.utility >= price:
            person.customer = 1
            addCust(person)
        else:
            person.customer = 0
            addNonCust(person)
    return len(self.customers)
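Answer 3 (score: 1)
Depending on how often you add new elements to self.people or change person.utility, you could consider sorting self.people by the utility field.
Then you could use a bisect function to find the lowest index i_pivot for which the condition self.people[i_pivot].utility >= price holds. This would have lower complexity (O(log N)) than your exhaustive loop (O(N)).
With this information, you could then update your people list if needed:
Do you really need to update the customer field each time? In the sorted case, you can easily deduce this value while iterating: for example, with the list sorted in increasing order, customer = (index >= i_pivot).
The same question applies to the customers and nonCustomers lists. Why do you need them? They could be replaced by slices of the original sorted list: for example, customers = self.people[i_pivot:].
All this would let you reduce the complexity of your algorithm and use more built-in (fast) Python functions, which could speed up your implementation.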
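A minimal sketch of the sorted-list idea, assuming utilities change rarely enough that sorting once pays off. The Person class and the split_by_price helper are hypothetical names for illustration. The utilities are kept in a parallel sorted list so that bisect_left can be used without a key argument, which is only available from Python 3.10 onwards.

```python
from bisect import bisect_left


class Person(object):
    def __init__(self, utility):
        self.utility = utility


def split_by_price(people_sorted, utilities_sorted, price):
    """Return (non_customers, customers) as slices of the sorted list.

    people_sorted must be sorted by increasing utility, and
    utilities_sorted must hold the matching utility values.
    """
    # First index whose utility is >= price, found in O(log N).
    i_pivot = bisect_left(utilities_sorted, price)
    return people_sorted[:i_pivot], people_sorted[i_pivot:]


# Sort once, up front; re-sort only when utilities change.
people = sorted((Person(u) for u in (120.0, 30.0, 75.0, 10.0, 200.0)),
                key=lambda p: p.utility)
utilities = [p.utility for p in people]

non_customers, customers = split_by_price(people, utilities, 75)
num_customers = len(customers)
print(num_customers)
```

Note that the slices themselves still copy their elements. If only the count is needed, len(people_sorted) - i_pivot gives it without building any list.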
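Answer 4 (score: 1)
This comment rings alarm bells:
    '''Returns quantity demanded in period timePd. In addition,
    also updates the list of customers and non-customers.
Apart from the fact that timePd is not used in the function, if you really want to just return the quantity, do just that in the function. Do the "in addition" stuff in a separate function.
Then profile again and see which of these two functions you are spending most of your time in.
I like to apply SRP (the single responsibility principle) to methods as well as classes: it makes them easier to test.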
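One possible way to split the two responsibilities is sketched below. Buyers and qtyDemanded come from the question; the updateCustomerLists method, the simplified price-only signatures and the sample data are assumptions made for illustration, written for Python 3.

```python
class Person(object):
    def __init__(self, utility):
        self.utility = utility
        self.customer = 0


class Buyers(object):
    def __init__(self, people):
        self.people = people
        self.customers = []
        self.nonCustomers = []

    def qtyDemanded(self, price):
        """Only count the people whose utility is at least price."""
        return sum(1 for person in self.people if person.utility >= price)

    def updateCustomerLists(self, price):
        """Do the 'in addition' work: set flags and rebuild both lists."""
        self.customers = []
        self.nonCustomers = []
        for person in self.people:
            if person.utility >= price:
                person.customer = 1
                self.customers.append(person)
            else:
                person.customer = 0
                self.nonCustomers.append(person)


buyers = Buyers([Person(u) for u in (10.0, 80.0, 150.0)])
quantity = buyers.qtyDemanded(75)
buyers.updateCustomerLists(75)
print(quantity, len(buyers.customers), len(buyers.nonCustomers))
```

With the work separated like this, a profiler reports the counting and the list maintenance as two distinct entries, and each method can be tested on its own.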
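Answer 5 (score: 0)
Some curious things I noted:
timePd is passed as a parameter but never used
price is an array but you only ever use the last entry - why not pass the value there instead of passing the list?
count is initialized and never used
self.people contains multiple person objects, which are then copied to either self.customers or self.noncustomers as well as having their customer flag set. Why not skip the copy operation and, on return, just iterate over the list, looking at the customer flag? This would save the expensive append.
Alternatively, try using psyco, which can speed up pure Python, sometimes considerably.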
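A small sketch of the skip-the-copy idea, assuming callers need the customer list only occasionally: the hot loop sets the flags and counts, and the list is derived from the flags on demand. Person, flag_customers and customers_of are hypothetical names, not part of the original code.

```python
class Person(object):
    def __init__(self, utility):
        self.utility = utility
        self.customer = 0


def flag_customers(people, price):
    """Set each person's customer flag and return the number of customers."""
    count = 0
    for person in people:
        person.customer = 1 if person.utility >= price else 0
        count += person.customer
    return count


def customers_of(people):
    """Rebuild the customer list from the flags, only when it is needed."""
    return [person for person in people if person.customer]


people = [Person(u) for u in (20.0, 75.0, 90.0, 300.0)]
num_customers = flag_customers(people, 75)
customers = customers_of(people)
print(num_customers)
```

Whether this helps depends on how often the rest of the program actually reads the customer list; if it is read after every call, the append cost is only moved, not removed.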
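Answer 6 (score: 0)
It is surprising that the function shown is such a bottleneck, because it is relatively simple. For that reason, I would double-check my profiling procedure and results. However, if they are correct, the most time-consuming part of your function has to be the for loop it contains, so it makes sense to focus on speeding that up. One way to do this is by replacing the if/else with straight-line code. You can also slightly reduce the attribute lookup for the append list method. Here is how both of those things could be accomplished: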
def qtyDemanded(self, timePd, priceVector):
    '''Returns quantity demanded in period timePd. In addition,
    also updates the list of customers and non-customers.
    Inputs: timePd and priceVector
    Output: count of people for whom priceVector[-1] < utility
    '''
    price = priceVector[-1]  # last price
    kinds = [[], []]  # initialize sublists of noncustomers and customers
    kindsAppend = [kinds[b].append for b in (False, True)]  # append methods
    for person in self.people:
        person.customer = person.utility >= price  # customer test
        kindsAppend[person.customer](person)  # add to proper list
    self.nonCustomers = kinds[False]
    self.customers = kinds[True]
    return len(self.customers)
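That said, I must add that it seems a little redundant to have a customer flag in each person object and also to put each of them into a separate list depending on that attribute. Not creating these two lists would of course speed the loop up further.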
Answer 7 (score: 0)
You are asking for guesses, and mostly you are getting guesses.
There is no need to guess. Here's an example.
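As one illustration of measuring rather than guessing (not necessarily the technique the linked example uses), the standard cProfile and pstats modules can show where the time goes. The Person class, count_customers and run below are stand-ins for the real program, and the sketch is written for Python 3.

```python
import cProfile
import io
import pstats


class Person(object):
    def __init__(self, utility):
        self.utility = utility
        self.customer = 0


def count_customers(people, price):
    count = 0
    for person in people:
        person.customer = is_customer = person.utility >= price
        count += is_customer
    return count


def run(people, calls):
    result = 0
    for _ in range(calls):
        result = count_customers(people, 75)
    return result


people = [Person(float(u)) for u in range(300)]

profiler = cProfile.Profile()
profiler.enable()
result = run(people, 1000)
profiler.disable()

# Sort by cumulative time and keep the ten most expensive entries.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

The report lists call counts alongside per-call and cumulative times, which makes it easier to tell a slow function from a cheap function that is simply called very often.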