Consider the following code snippet:
import random
from uncertainties import unumpy, ufloat
x = [random.uniform(0,1) for p in range(1,8200)]
y = [random.randrange(0,1000) for p in range(1,8200)]
xerr = [random.uniform(0,1)/1000 for p in range(1,8200)]
yerr = [random.uniform(0,1)*10 for p in range(1,8200)]
x = unumpy.uarray(x, xerr)
y = unumpy.uarray(y, yerr)
diff = sum(x*y)
u = ufloat(0.0, 0.0)
for k in range(len(x)):
    u += (diff - x[k])**2 * y[k]
print(u)
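If I try to run this on my computer, it can take up to 10 minutes to produce a result. I am not sure why that is, and I would appreciate some kind of explanation.
If I had to guess, I would say that computing the uncertainties is, for some reason, more complicated than one would think, but as I said, that is only a guess. Interestingly, if I remove the print instruction at the end, the code finishes almost instantly, which honestly confuses me more than it helps...
In case you don't know it, this is the repository of the uncertainties library.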
Answer 0 (score: 1)
I can reproduce this; the printing is what takes forever. Or rather, it is the conversion to a string that print invokes implicitly.
I used line_profiler to measure the time spent in the `__format__` function of `AffineScalarFunc`. (It is called by `__str__`, which is called by print.)
I reduced the array size from 8200 to 1000 to make it run faster. This is the result (trimmed for readability):
Timer unit: 1e-06 s
Total time: 29.1365 s
File: /home/veith/Projects/stackoverflow/test/lib/python3.6/site-packages/uncertainties/core.py
Function: __format__ at line 1813
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
  1813                                               @profile
  1814                                               def __format__(self, format_spec):
  1960
  1961                                                   # Since the '%' (percentage) format specification can change
  1962                                                   # the value to be displayed, this value must first be
  1963                                                   # calculated. Calculating the standard deviation is also an
  1964                                                   # optimization: the standard deviation is generally
  1965                                                   # calculated: it is calculated only once, here:
  1966         1          2.0      2.0      0.0          nom_val = self.nominal_value
  1967         1   29133097.0 29133097.0    100.0          std_dev = self.std_dev
  1968
You can see that almost all of the time is spent on line 1967, where the standard deviation is calculated. If you dig a little deeper, you will find that the `error_components` property is the problem, within which the `derivatives` property is the problem, within which `_linear_part.expand()` is the problem. If you profile that, you start to get to the root of the problem. Most of the work here is evenly distributed:
Function: expand at line 1481
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
  1481                                               @profile
  1482                                               def expand(self):
  1483                                                   """
  1484                                                   Expand the linear combination.
  1485
  1486                                                   The expansion is a collections.defaultdict(float).
  1487
  1488                                                   This should only be called if the linear combination is not
  1489                                                   yet expanded.
  1490                                                   """
  1491
  1492                                                   # The derivatives are built progressively by expanding each
  1493                                                   # term of the linear combination until there is no linear
  1494                                                   # combination to be expanded.
  1495
  1496                                                   # Final derivatives, constructed progressively:
  1497         1          2.0      2.0      0.0          derivatives = collections.defaultdict(float)
  1498
  1499  15995999    4942237.0      0.3      9.7          while self.linear_combo:  # The list of terms is emptied progressively
  1500
  1501                                                       # One of the terms is expanded or, if no expansion is
  1502                                                       # needed, simply added to the existing derivatives.
  1503                                                       #
  1504                                                       # Optimization note: since Python's operations are
  1505                                                       # left-associative, a long sum of Variables can be built
  1506                                                       # such that the last term is essentially a Variable (and
  1507                                                       # not a NestedLinearCombination): popping from the
  1508                                                       # remaining terms allows this term to be quickly put in
  1509                                                       # the final result, which limits the number of terms
  1510                                                       # remaining (and whose size can temporarily grow):
  1511  15995998    6235033.0      0.4     12.2              (main_factor, main_expr) = self.linear_combo.pop()
  1512
  1513                                                       # print "MAINS", main_factor, main_expr
  1514
  1515  15995998   10572206.0      0.7     20.8              if main_expr.expanded():
  1516  15992002    6822093.0      0.4     13.4                  for (var, factor) in main_expr.linear_combo.items():
  1517   7996001    8070250.0      1.0     15.8                      derivatives[var] += main_factor*factor
  1518
  1519                                                           else:  # Non-expanded form
  1520  23995993    8084949.0      0.3     15.9                  for (factor, expr) in main_expr.linear_combo:
  1521                                                                   # The main_factor is applied to expr:
  1522  15995996    6208091.0      0.4     12.2                      self.linear_combo.append((main_factor*factor, expr))
  1523
  1524                                                       # print "DERIV", derivatives
  1525
  1526         1          2.0      2.0      0.0          self.linear_combo = derivatives
You can see that there are a lot of calls to `expanded`, which calls `isinstance`, which is slow.
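Also note the comments, which hint that the library actually computes the derivatives only when they are needed (and is aware that doing otherwise would be really slow). This is why the conversion to a string takes so long, while no time is spent before it.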
In `__init__` of `AffineScalarFunc`:
# In order to have a linear execution time for long sums, the
# _linear_part is generally left as is (otherwise, each
# successive term would expand to a linearly growing sum of
# terms: efficiently handling such terms [so, without copies]
# is not obvious, when the algorithm should work for all
# functions beyond sums).
See also the comments in `std_dev` of `AffineScalarFunc`.
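To make the idea of deferred expansion concrete, here is a minimal toy sketch. It is not the library's code: `Var`, `LazySum`, and `expand_steps` are hypothetical names invented for this illustration, and the model handles only linear sums. It stores nested terms untouched while an expression is built and flattens them only when the standard deviation is requested.

```python
import math
from collections import defaultdict


class Var:
    """Hypothetical independent variable carrying its own standard deviation."""

    def __init__(self, name, std_dev):
        self.name = name
        self.std_dev = std_dev


class LazySum:
    """Hypothetical linear combination: a list of (factor, Var or LazySum) terms."""

    def __init__(self, terms):
        self.terms = list(terms)  # stored as given, so building is cheap
        self.expand_steps = 0     # terms visited by the most recent expansion

    def derivatives(self):
        """Flatten the nested terms into a {Var: total factor} mapping."""
        result = defaultdict(float)
        stack = list(self.terms)
        self.expand_steps = 0
        while stack:
            factor, expr = stack.pop()
            self.expand_steps += 1
            if isinstance(expr, Var):
                result[expr] += factor
            else:
                # Nested sum: push its terms, scaled by the outer factor.
                stack.extend((factor * f, e) for f, e in expr.terms)
        return result

    def std_dev(self):
        """Combine independent errors in quadrature; this triggers the expansion."""
        return math.sqrt(sum((d * v.std_dev) ** 2
                             for v, d in self.derivatives().items()))


a, b = Var("a", 3.0), Var("b", 4.0)
shared = LazySum([(1.0, a), (1.0, b)])
total = LazySum([(1.0, shared) for _ in range(1000)])  # nothing is expanded yet
print(total.std_dev(), total.expand_steps)  # shared is walked once per reference
```

In this toy model a sub-sum that is referenced many times is walked again for every reference. Something similar may be happening with `diff` in the question's loop, which would fit the roughly 16 million iterations in the profile, but that is an assumption rather than something the profile proves.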
All in all, this is somewhat expected, since the library handles these non-native numbers, which (apparently) require a lot of operations to process.
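If you want to check this without a profiler, a rough approach is to time the construction of the expression and the string conversion separately. The sketch below assumes the uncertainties package is installed; `timed`, `build`, and `compare` are helper names made up for this example, and the default `n=200` is an arbitrary small size chosen to keep the run short.

```python
import random
import time


def timed(fn):
    """Run fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def build(n):
    """Rebuild the question's expression with n points (needs uncertainties)."""
    from uncertainties import ufloat  # third-party, imported only when called

    x = [ufloat(random.uniform(0, 1), random.uniform(0, 1) / 1000) for _ in range(n)]
    y = [ufloat(random.randrange(0, 1000), random.uniform(0, 1) * 10) for _ in range(n)]
    diff = sum(xi * yi for xi, yi in zip(x, y))
    u = ufloat(0.0, 0.0)
    for xi, yi in zip(x, y):
        u += (diff - xi) ** 2 * yi
    return u


def compare(n=200):
    """Print how long building the expression and formatting it each take."""
    u, t_build = timed(lambda: build(n))
    _, t_str = timed(lambda: str(u))
    print(f"build: {t_build:.3f} s, str(): {t_str:.3f} s")
```

Calling `compare()` with a few increasing values of `n` should show how the two timings scale; the exact numbers will depend on the machine and the library version.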