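I have a dictionary containing 100 Cluster objects. The clusters have several Member objects, and I need to add each member to the cluster it belongs to. My problem is that every member gets added to every cluster, and I can't figure out why. Here is the code: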
self.clusters = {}
with open('/tmp/numpy_dumps/kmeansInput.txt.cluster_centres') as f:
    for line in f:
        cluster = Cluster(line)
        self.clusters[cluster.id] = cluster
with open('/tmp/numpy_dumps/kmeansInput.txt.membership') as f:
    for line in f:
        member = Member(line, self.reps)
        self.clusters[member.clusterId].members[member.imageId] = member
for id, cluster in self.clusters.items():
    print(cluster)
    print(cluster.members)
    print('cluster {} has {} members'.format(id, len(cluster.members)))
The output tells me that every cluster has all the members.
Answer 0 (score: 1)
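The problem is almost certainly in the Cluster class, which you didn't post in your snippet. This is a wild guess, but this behaviour is typical of shared attributes, either class attributes or mutable default arguments. If your Cluster class looks like one of the snippets below, then look no further: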
# class attributes:
class Cluster(object):
    members = {}  # this will be shared by all instances

# solution:
class Cluster(object):
    def __init__(self):
        self.members = {}  # this will be per instance

# default mutable argument:
class Cluster(object):
    def __init__(self, members={}):
        # this one is a well-known gotcha:
        # the default for the `members` arg is eval'd only once
        # so all instances created without an explicit
        # `members` arg will share the same `members` dict
        self.members = members

# solution:
class Cluster(object):
    def __init__(self, members=None):
        if members is None:
            members = {}
        self.members = members
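
To check which case you are in, here is a small sketch that reproduces both kinds of sharing next to the fixed version. The class names (SharedCluster, DefaultArgCluster, FixedCluster) and the helper shares_members are made up for illustration. They are not your real Cluster class, which I still haven't seen, and they take no `line` argument.

```python
class SharedCluster(object):
    members = {}  # class attribute: one dict shared by every instance


class DefaultArgCluster(object):
    def __init__(self, members={}):
        # the default dict is created once, when the function is defined
        self.members = members


class FixedCluster(object):
    def __init__(self, members=None):
        # build a fresh dict per instance unless one is passed in
        self.members = {} if members is None else members


def shares_members(cls):
    """Return True if two fresh instances of cls share one members dict."""
    a, b = cls(), cls()
    a.members['img1'] = 'member'  # hypothetical image id and member value
    return a.members is b.members and 'img1' in b.members


if __name__ == '__main__':
    for cls in (SharedCluster, DefaultArgCluster, FixedCluster):
        print(cls.__name__, shares_members(cls))
```

The same identity test works on your own data: pick any two clusters from self.clusters and compare their members attributes with `is`. If the result is True, they are the very same dict object, which would explain why every cluster reports all the members.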