Question

对于自我项目，我想做类似的事情：

Animal(somename, 3)

正如您所看到的，我并不是每个ID都需要不止一个Species（id）实例，但每次我创建一个Animal对象时，我都会创建一个id，我可能需要多次调用，例如(a == b) == (a is b)。

要解决这个问题，我要做的就是创建一个课程，以便在2个实例中，让我们说a和b，以下是永远的：

a = "hello"
b = "hello"
print(a is b)

这是Python用字符串文字做的事情，被称为实习。例如：

class MyClass(object):

    myHash = {} # This replicates the intern pool.

    def __new__(cls, n): # The default new method returns a new instance
        if n in MyClass.myHash:
            return MyClass.myHash[n]

        self = super(MyClass, cls).__new__(cls)
        self.__init(n)
        MyClass.myHash[n] = self

        return self

    # as pointed out on an answer, it's better to avoid initializating the instance 
    # with __init__, as that one's called even when returning an old instance.
    def __init(self, n): 
        self.n = n

a = MyClass(2)
b = MyClass(2)

print a is b # <<< True

该print将产生true（只要字符串足够短，如果我们直接使用python shell）。

我只能猜测CPython是如何做到这一点的（它可能涉及一些C魔法）所以我正在做我自己的版本。到目前为止，我已经：

     <select ng-change='taskTimer(duration)' ng-model='duration'>
     <option ng-repeat='TO in timeOptions'
          ng-selected='duration'>{{TO}}</option>
          {{TO}}</option>
     </select>

我的问题是：

a）我的问题是否值得解决？因为我想要的Species对象应该是非常轻的并且可以调用Animal的最大次数，而不是有限的（想象一个口袋妖怪游戏：不超过1000个实例，顶部）

b）如果是，这是解决我问题的有效方法吗？

c）如果它无效，请您详细说明一种更简单/更清洁/更Pythonic的解决方法吗？

Answer 1

是的，实现返回缓存对象的__new__方法是创建有限数量实例的适当方法。如果您不希望创建大量实例，则可以实现__eq__并按值而不是标识进行比较，但这样做并不会造成伤害。

请注意，不可变对象通常应在__new__而不是__init__中进行所有初始化，因为后者是在创建对象后调用的。此外，__init__将在从__new__返回的类的任何实例上调用，因此，当您进行缓存时，每次返回缓存对象时都会再次调用它。 / p>

此外，__new__的第一个参数是类对象而不是实例，因此您可能应将其命名为cls而不是self（您可以使用self代替如果你愿意，可以在方法的后面加instance！）。

Answer 2

为了使这一点尽可能通用，我将推荐一些东西。一，如果你愿意，可以继承namedtuple＆＃34; true＆＃34;不变性（通常人们会对此不屑一顾，但是当你进行实习时，打破不可变的不变量会导致更大的问题）。其次，使用锁来允许线程安全行为。

因为这相当复杂，我将提供Species代码的修改后的副本以及解释它的注释：

import collections
import operator
import threading

# Inheriting from a namedtuple is a convenient way to get immutability
class Species(collections.namedtuple('SpeciesBase', 'species_id height ...')):
    __slots__ = ()  # Prevent creation of arbitrary values on instances; true immutability of declared values from namedtuple makes true immutable instances

    # Lock and cache, with underscore prefixes to indicate they're internal details
    _cache_lock = threading.Lock()
    _cache = {} 

    def __new__(cls, species_id):  # Switching to canonical name cls for class type
        # Do quick fail fast check that ID is in fact an int/long
        # If it's int-like, this will force conversion to true int/long
        # and minimize risk of incompatible hash/equality checks in dict
        # lookup
        # I suspect that in CPython, this would actually remove the need
        # for the _cache_lock due to the GIL protecting you at the
        # critical stages (because no byte code is executing comparing
        # or hashing built-in int/long types), but the lock is a good idea
        # for correctness (avoiding reliance on implementation details)
        # and should cost little
        species_id = operator.index(species_id)

        # Lock when checking/mutating cache to make it thread safe
        try:
            with cls._cache_lock:
                return cls._cache[species_id]
        except KeyError:
            pass

        # Read in data here; not done under lock on assumption this might
        # be expensive and other Species (that already exist) might be
        # created/retrieved from cache during this time
        species_id = ...
        height = ...
        # Pass all the values read to the superclass (the namedtuple base)
        # constructor (which will set them and leave them immutable thereafter)
        self = super(Species, cls).__new__(cls, species_id, height, ...)

        with cls._cache_lock:
            # If someone tried to create the same species and raced
            # ahead of us, use their version, not ours to ensure uniqueness
            # If no one raced us, this will put our new object in the cache
            self = cls._cache.setdefault(species_id, self)
        return self

如果你想为普通图书馆实习（用户可能会被线程化，你不能相信他们不会破坏不变性不变），像上面这样的东西就是一个基本的结构。它很快，即使构造是重量级的，也可以最大限度地减少停顿的机会（换取可能不止一次重建一个对象，如果许多线程试图一次第一次构建它，则丢弃除一个副本之外的所有对象），等

当然，如果构造很便宜并且实例很小，那么只需写一个__eq__（如果它在逻辑上是不可变的话，可能__hash__并完成它。

试图为非字符串复制Python的字符串实习功能

2 个答案: