Strange Python error

Time: 2010-03-16 19:41:14

Tags: python

I am trying to write a Python program that computes a histogram from a list of numbers such as:

1
3
2
3
4
5
3.2
4
2
2

So the input arguments are the file name and the number of intervals.

The program code is:

#!/usr/bin/env python
import os, sys, re, string, array, math
import numpy

Lista = []

db = sys.argv[1] 
db_file = open(db,"r")
ic=0
nintervals= int(sys.argv[2])

while 1:
    line = db_file.readline()
    if not line:
        break
    ll=string.split(line)
    #print ll[6]
    Lista.insert(ic,float(ll[0]))
    ic=ic+1

lmin=min(Lista)
print "min= ",lmin
lmax=max(Lista)
print "max= ",lmax

width=666.666
width=(lmax-lmin)/nintervals
print "width= ",width

nelements=len(Lista)
print "nelements= ",nelements
print " "
Histogram = numpy.zeros(shape=(nintervals))

for item in Lista:
    #print item
    int_number = 1 + int((item-lmin)/width)
    print " "
    print "item,lmin= ",item,lmin
    print "(item-lmin)/width= ",(item-lmin)," / ",width," ====== ",(float(item)-float(lmin))/float(width)
    print "int((item-lmin)/width)= ",int((item-lmin)/width) 
    print item , " belongs to interval ", int_number, " which is from ", lmin+width*(int_number-1), " to ",lmin+width*int_number
    Histogram[int_number] = Histogram[int_number] + 1


But somehow I am completely lost and I get strange errors. Can anyone help?

Thanks

P.S. This is the output:

item,lmin=  1.0 1.0
(item-lmin)/width=  0.0  /  0.666666666667  ======  0.0
int((item-lmin)/width)=  0
1.0  belongs to interval  1  which is from  1.0  to  1.66666666667

item,lmin=  2.0 1.0
(item-lmin)/width=  1.0  /  0.666666666667  ======  1.5
int((item-lmin)/width)=  1
2.0  belongs to interval  2  which is from  1.66666666667  to  2.33333333333

item,lmin=  3.0 1.0
(item-lmin)/width=  2.0  /  0.666666666667  ======  3.0
int((item-lmin)/width)=  3
3.0  belongs to interval  4  which is from  3.0  to  3.66666666667
Traceback (most recent call last):
  File "from_list_to_histogram.py", line 43, in <module>
    Histogram[int_number] = Histogram[int_number] + 1
IndexError: index out of bounds

The most important errors are:

(item-lmin)/ width = 1.0 / 0.666666666667 ====== 1.5

IndexError: index out of bounds

4 answers:

Answer 0: (score: 1)

I think the problem is probably this specific line:

int_number = 1 + int((item-lmin)/width)

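Why the 1 +? Python indices on an array of length N run from 0 to N-1 inclusive. Here the 1 + makes int_number run from 1 to 1 + (lmax-lmin)/width, i.e. 1 + nintervals given the formula for width, while Histogram is dimensioned to hold only nintervals items. So it is actually an off-by-two error, made worse by the 1 +, but it would be there even without it (for lmax only). Make the intervals a bit wider, so that lmax falls inside the last one rather than just beyond it, and drop the 1 +, and things might go better.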
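
To make the index range concrete, here is a minimal sketch using the sample data from the question. The helper names index_range and widened_indices and the slack parameter are hypothetical, introduced only for illustration, and the code assumes lmax > lmin. It shows which indices the formula yields with and without the 1 +, and one possible way of making the intervals "a bit wider".

```python
def index_range(values, nintervals, offset):
    """Smallest and largest index given by offset + int((item - lmin) / width)."""
    lmin, lmax = min(values), max(values)
    width = float(lmax - lmin) / nintervals  # assumes lmax > lmin
    indices = [offset + int((item - lmin) / width) for item in values]
    return min(indices), max(indices)


def widened_indices(values, nintervals, slack=1e-9):
    """Zero-based indices with intervals stretched by a tiny factor (an assumption),
    so that lmax falls inside the last interval instead of just beyond it."""
    lmin, lmax = min(values), max(values)
    width = float(lmax - lmin) * (1.0 + slack) / nintervals
    return [int((item - lmin) / width) for item in values]


values = [1, 3, 2, 3, 4, 5, 3.2, 4, 2, 2]
with_offset = index_range(values, 3, 1)     # indices produced with the 1 +
without_offset = index_range(values, 3, 0)  # indices produced without it
widened = widened_indices(values, 3)        # all within 0 .. nintervals - 1
```

With 3 intervals the valid indices are 0 to 2. The stretch factor is a judgment call: a value lying extremely close to an interior boundary could end up one interval lower than it would with the exact width.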

Answer 1: (score: 1)

Here is a more Pythonic approach.

from itertools import groupby
from math import floor

data = [1,3,2,3,4,5,3.2,4,2,2,3.6]
data.sort()

nintervals = 3
lmax = max(data)
lmin = min(data)

width = 1.0*(lmax-lmin)/nintervals

def grouper(item):    
    return floor(1.0*(item-lmin)/width)

for i, b in groupby(data, grouper):
    print '%.3f <= i < %.3f ' %(lmin + i * width, lmin + (i+1) * width), list(b)
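
One thing to watch in the grouping above: for the largest value, (item-lmin)/width comes out as nintervals, so lmax forms an extra group beyond the requested intervals. A possible variation, sketched below, caps the key at nintervals - 1. The function name grouped_bins is hypothetical, the code assumes lmax > lmin, and it is written to run on Python 3 as well.

```python
from itertools import groupby
from math import floor


def grouped_bins(data, nintervals):
    """Group sorted values by interval index, folding lmax into the last interval."""
    data = sorted(data)  # groupby only merges adjacent items with equal keys
    lmin, lmax = data[0], data[-1]
    width = float(lmax - lmin) / nintervals  # assumes lmax > lmin

    def grouper(item):
        # Cap the key so the maximum value does not open an extra group.
        return min(int(floor((item - lmin) / width)), nintervals - 1)

    return [(i, list(group)) for i, group in groupby(data, grouper)]


groups = grouped_bins([1, 3, 2, 3, 4, 5, 3.2, 4, 2, 2, 3.6], 3)
```

Intervals that contain no values produce no group at all, so this form suits listing members rather than filling a fixed-size array of counts.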

Answer 2: (score: 0)

On the last line, you access Histogram with an index that is too large. You should make sure that 'int_number' is at most

len(Histogram) - 1

There is probably an off-by-one error causing this.
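
A minimal sketch of enforcing that bound, assuming zero-based interval numbers (so the 1 + from the question is dropped) and lmax > lmin. The function name count_bins is hypothetical.

```python
def count_bins(values, nintervals):
    """Count values per interval while keeping every index <= len(histogram) - 1."""
    lmin, lmax = min(values), max(values)
    width = float(lmax - lmin) / nintervals  # assumes lmax > lmin
    histogram = [0] * nintervals
    for item in values:
        int_number = int((item - lmin) / width)
        # lmax computes to nintervals, one past the end, so cap it.
        int_number = min(int_number, len(histogram) - 1)
        histogram[int_number] += 1
    return histogram


counts3 = count_bins([1, 3, 2, 3, 4, 5, 3.2, 4, 2, 2], 3)
counts6 = count_bins([1, 3, 2, 3, 4, 5, 3.2, 4, 2, 2], 6)
```

Capping with min means the last interval is treated as closed on the right, so the maximum value is counted there rather than raising an IndexError.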

Answer 3: (score: 0)

I just removed the code that loads the data from a file and rewrote the rest into something more readable:

from math import floor

Lista = [1,3,2,3,4,5,3.2,4,2,2]
ic=0
nintervals= 3

lmin=min(Lista)
print "min= ",lmin
lmax=max(Lista)
print "max= ",lmax

width=1.0*(lmax-lmin)/nintervals
print "width= ",width

nelements=len(Lista)
print "nelements= ",nelements
print " "
histogram =[0]*nintervals

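for item in Lista:
    # Zero-based interval index; lmax itself computes to nintervals,
    # so fold it back into the last interval.
    ind = int(floor(1.0*(item-lmin)/width))
    if ind==nintervals:
        ind=ind-1
    histogram[ind]+=1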

for i,v in enumerate(histogram):
    print "from", lmin+i*width, "to", lmin+(i+1)*width, "are",v,"values"

for i,v in enumerate(histogram):
    print "Visual presentation:","="*int(round(v*40.0/lmax))
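
The file loading left out of this answer could be restored along the following lines. This is only a sketch: the function name read_values is hypothetical, and skipping blank lines is an added assumption (the question's loop takes ll[0] from every line). Like the question's code, it uses the first whitespace-separated field of each line.

```python
def read_values(path):
    """Return the first whitespace-separated field of each non-blank line as a float."""
    values = []
    with open(path) as handle:  # the file is closed automatically
        for line in handle:
            fields = line.split()
            if fields:  # assumption: silently skip blank lines
                values.append(float(fields[0]))
    return values
```

The returned list could then take the place of the hard-coded Lista, with the path taken from sys.argv[1] and nintervals from int(sys.argv[2]) as in the question.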