Python, being the dynamic language that it is, offer multiple ways to implement the same feature. These options may vary in readability, maintainability and performance. Even though the usual scripts that I write in Python are of a disposable nature, I now have a certain project that I am working on (academic) that must be readable, maintainable and perform reasonably well. Since I haven't done any serious coding in Python before, including any sort of profiling, I need help in deciding the balance between the three factors I mentioned above.
Here's a code snippet from one of the modules in a scientific package that I am working on. It is an n-ary Tree class with a very basic skeleton structure. This was written with inheritance and sub classing in mind.
Note : in the code below a tree is the same thing as a node. Every tree is an instance of the same class Tree.
class Tree(object):
def __init__(self, parent=None, value=None):
self.parent = parent
self.value = value
self.children = set()
The two functions below belongs to this class (along with many others)
def isexternal(self):
"""Return True if this is an external tree."""
return not bool(self.children)
def isleaf(self):
"""Return True if this is a leaf tree."""
return not bool(self.children)
Both these functions are doing exactly the same thing - they are just two different names. So, why not change it to something like:
def isleaf(self):
"""Return True of this is a leaf tree."""
return self.isexternal()
My doubts are these :
I've read that function calls in Python are rather expensive (creating new stacks for each call), but I don't know if it is a good or bad thing if one function depends on another. How will it affect maintainability. This happens many times in my code, where I call one method from another method to avoid code duplication. Is it bad practice to do this?
Here's another example of this code duplication scenario in the same class:
def isancestor(self, tree):
"""Return True if this tree is an ancestor of the specified tree."""
return tree.parent is self or (not tree.isroot()
and self.isancestor(tree.parent))
def isdescendant(self, tree):
"""Return True if this tree is a descendant of the specified tree."""
return self.parent is tree or (not self.isroot()
and self.parent.isdescendant(tree))
I could instead go for:
def isdescendant(self, tree):
"""Return True if this tree is a descendant of the specified tree."""
return tree.isancestor(self)
答案 0 :(得分:21)
Very broadly speaking, there are two types of optimization: macro optimizations and micro optimizations. Macro optimizations include things like your choice of algorithms, deciding between different data structures, and the like. Things that can have a big impact on performance and often have large ripple effects on your code base if you change your mind. Switching from a data structure with linear O(n) to one with constant O(1) inserts could be a huge win and well worth the cost of doing it. Adding caching may change a dog slow algorithm into a lightning fast one.
Micro optimizations are things like eliding or inlining function calls, eliminating or adding variables, caching calculation results for a very short window, unrolling loops, etc. As a rule, you should forget about these types of optimizations and focus on the readability and maintainability of your code. The effects of micro optimizations are simply too small to be worth it.
You should only consider these types of changes after profiling your code. If you can identify a critical loop that would benefit from such an optimization, and your profiling confirms it would, and you make the change and verify the improvement worked with another round of profiling--then you should micro optimize.
But until then, don't sweat the small stuff.
def isdescendant(self, tree):
"""Return True if this tree is a descendant of the specified tree."""
return tree.isancestor(self)
I would absolutely recommend this type of code reuse. It makes it crystal clear that isdescendant
is the inverse of isancestor
. It ensures that both functions work the same way so you can't inadvertantly introduce a bug in one but not the other.
def isleaf(self):
"""Return True of this is a leaf tree."""
return self.isexternal()
Here I would ask myself if isleaf
and isexternal
are conceptually the same. Ignoring that they're implemented the same, are they logically identical? If so, I would have one call the other. If it's just happenstance that they have the same implementation, I might duplicate the code. Can you imagine a scenario where you would want to change one function but not the other? That would point towards duplication.
答案 1 :(得分:5)
This approach works well without introducing additional stack frames.
def isexternal(self):
"""Return True of this is an external tree."""
return not bool(self.children)
isleaf = isexternal
In your second case, the algorithms are fundamentally different between the two methods. I think the solution you propose is fine.
答案 2 :(得分:2)
Just a small test:
>>> timeit('a()', setup="def a(): pass")
0.08267402648925781
>>> timeit('1+1')
0.03854799270629883
So, a simple function call has less than 2.5 times running time compared to a simple arithmetic expression. I don't think, it counts as "rather expensive".
答案 3 :(得分:1)
David Hess's answer is good...
Except, it's neither optimal nor canonical Python to say var w = 0;
$(".pages button").each(function () {
w = w + $(this).outerWidth(true)
});
.
not bool(x)
gives exactly the same results, and is one global lookup and one function call cheaper.
Also, if you're using not x
twice in the same call, you might consider whether you want to put it in a local -- self.parent
-- because local lookups are a lot cheaper than instance lookups. Of course, you should run timeit to make sure you're getting a benefit.