I'm working through Toby Segaran's Programming Collective Intelligence, but while converting the code from Py2 to Py3 I can't figure out where this error is coming from.
def hcluster(rows, distance=pearson):
    distances={}
    currentclustid=-1
    # Clusters are initially just the rows
    clust=[bicluster(rows[i],id=i) for i in range(len(rows))]
    while len(clust)>1:
        lowestpair=(0,1)
        closest=distance(clust[0].vec,clust[1].vec)
        # Loop through every pair looking for the smallest distance
        for i in range(len(clust)):
            for j in range(i+1,len(clust)):
                # Distances is the cache of distance calculations
                if (clust[i].id,clust[j].id) not in distances:
                    distances[(clust[i].id,clust[j].id)] =distance(clust[i].vec,clust[j].vec)
                d=distances[(clust[i].id,clust[j].id)]
                if d<closest:
                    closest=d
                    lowestpair=(i,j)
                # Calculate the average of the two clusters
                mergevec=[
                    (clust[lowestpair[0]].vec[i]+clust[lowestpair[1]].vec[i])/2.0
                    for i in range(len(clust[0].vec))]
                # Create the new cluster
                newcluster=bicluster(mergevec,left=clust[lowestpair[0]],
                                     right=clust[lowestpair[1]], distance=closest,
                                     id=currentclustid)
                # Cluster ids that weren't in the original set are negative
                currentclustid-=1
                del clust[lowestpair[1]]
                del clust[lowestpair[0]]
        clust.append(newcluster)
    return clust[0]
Traceback (most recent call last):
  File "<pyshell#9>", line 1, in <module>
    clust=clusters.hcluster(data)
  File "C:\Users\Boogz\AppData\Local\Programs\Python\Python35\Lib\site-packages\clusters.py", line 83, in hcluster
    del clust[lowestpair[1]]
IndexError: list assignment index out of range
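Another thread I found while searching for "IndexError out of range" suggested that the person's mistake was trying to write to an element that did not exist yet, but I can't see where I'm doing that.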
Answer 0 (score: 0)
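Going only by the logic, with no context on what this is actually supposed to do, no complete runnable example, and no knowledge of the book you mention, I have to guess a bit, but I think your indentation is messed up. Please take a look: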
def hcluster(rows, distance=pearson):
    distances={}
    currentclustid=-1
    # Clusters are initially just the rows
    clust=[bicluster(rows[i],id=i) for i in range(len(rows))]
    while len(clust)>1:
        lowestpair=(0,1)
        closest=distance(clust[0].vec,clust[1].vec)
        # Loop through every pair looking for the smallest distance
        for i in range(len(clust)):
            for j in range(i+1,len(clust)):
                # Distances is the cache of distance calculations
                if (clust[i].id,clust[j].id) not in distances:
                    distances[(clust[i].id,clust[j].id)] =distance(clust[i].vec,clust[j].vec)
                d=distances[(clust[i].id,clust[j].id)]
                if d<closest:
                    closest=d
                    lowestpair=(i,j)
        # The following two blocks should be at the same indentation level
        # as the clust.append(newcluster) at the very end; otherwise you are
        # only overwriting the same thing unnecessarily on each iteration.
        # Calculate the average of the two clusters
        mergevec=[
            (clust[lowestpair[0]].vec[i]+clust[lowestpair[1]].vec[i])/2.0
            for i in range(len(clust[0].vec))]
        # Create the new cluster
        newcluster=bicluster(mergevec,left=clust[lowestpair[0]],
                             right=clust[lowestpair[1]], distance=closest,
                             id=currentclustid)
        # The following block should be outside the i and j loops;
        # otherwise you delete more than you append back, so clust gets
        # too short to delete elements with index len(clust)-1.
        # Cluster ids that weren't in the original set are negative
        currentclustid-=1
        del clust[lowestpair[1]]
        del clust[lowestpair[0]]
        clust.append(newcluster)
    return clust[0]
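I think mergevec should be created once for each lowestpair. If it sits inside the i and j loops, one is created for every combination of the two. That may not be the problem itself, but it hints at it:
You are deleting inside the inner for loop. That obviously shortens clust, but the range the loop runs over was computed from the length clust had when the loop started. So once you have deleted, the list is too short to delete the "last" items in that range.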
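To make the mechanism concrete, here is a minimal sketch that uses plain numbers instead of the book's bicluster objects. The function names delete_inside_loop and merge_closest are hypothetical, and the absolute difference stands in for the pearson distance; neither comes from the book. The first function reproduces the error by deleting inside a loop whose range was fixed up front, and the second only selects the pair inside the loops and changes the list once per pass.

```python
def delete_inside_loop(values):
    """Buggy pattern: shrink the list while looping over its old length."""
    items = list(values)
    for i in range(len(items)):  # range(len(items)) is evaluated once, up front
        del items[i]             # the list gets shorter on every pass
    return items


def merge_closest(values):
    """Safer pattern: pick the pair inside the loops, mutate afterwards."""
    items = list(values)
    while len(items) > 1:
        lowestpair = (0, 1)
        closest = abs(items[0] - items[1])
        # Only select here; the list is not modified inside the loops.
        for i in range(len(items)):
            for j in range(i + 1, len(items)):
                d = abs(items[i] - items[j])
                if d < closest:
                    closest = d
                    lowestpair = (i, j)
        # One merge per pass of the while loop, after both for loops end.
        merged = (items[lowestpair[0]] + items[lowestpair[1]]) / 2.0
        del items[lowestpair[1]]  # higher index first, so the lower one stays valid
        del items[lowestpair[0]]
        items.append(merged)
    return items[0]


if __name__ == "__main__":
    try:
        delete_inside_loop(range(6))
    except IndexError as exc:
        print("IndexError:", exc)
    print(merge_closest([1.0, 2.0, 10.0]))  # 1.0 and 2.0 merge first -> 5.75
```

In the second function each pass removes two items and appends one, so the list shrinks by exactly one per pass and the while loop ends with a single element, which is the behaviour hcluster appears to be aiming for.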