Question

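I am trying to use Python to calculate the area covered by cells in a file whose values are stored in RLE (run-length encoding) format. I am having trouble with my for and if statements.

I am trying to calculate the area of the cells that contain the value 201.

My current error: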

Traceback (most recent call last):
  File "D:\2016-2017\Fall2016\SpatialDataStructures\Labs\Lab4\Area_Calc_from_RLE.py", line 68, in <module>
    if cellValuesAsIntList[i] == 201:
IndexError: string index out of range

Here is the file I am working with:

ncols         40
nrows         40
xllcorner     -2.3036649208516
yllcorner     -1.1518324594945
cellsize      25
NODATA_value  -9999
13, 0, 1, 201, 26, 0, 1, 3, 12, 0, 1, 201, 39, 0, 1, 201, 39, 0, 1, 201, 39, 0, 2, 201, 33, 0, 4, 501, 2, 0, 2, 201, 32, 0, 4, 501, 3, 0, 3, 201, 29, 0, 5, 501, 5, 0, 2, 201, 26, 0, 7, 501, 6, 0, 1, 201, 25, 0, 8, 501, 6, 0, 1, 201, 25, 0, 7, 501, 7, 0, 1, 201, 25, 0, 6, 501, 8, 0, 1, 201, 25, 0, 6, 501, 8, 0, 1, 201, 6, 0, 4, 102, 15, 0, 7, 501, 7, 0, 1, 201, 6, 0, 6, 102, 14, 0, 7, 501, 5, 0, 2, 201, 6, 0, 6, 102, 15, 0, 6, 501, 5, 0, 1, 201, 7, 0, 5, 102, 16, 0, 6, 501, 5, 0, 1, 201, 7, 0, 4, 102, 18, 0, 8, 501, 2, 0, 5, 201, 3, 0, 3, 102, 19, 0, 9, 501, 5, 0, 2, 201, 25, 0, 8, 501, 6, 0, 1, 201, 29, 0, 4, 501, 6, 0, 1, 201, 29, 0, 4, 501, 6, 0, 1, 201, 9, 0, 1, 501, 5, 102, 15, 0, 2, 501, 7, 0, 1, 201, 9, 0, 6, 102, 23, 0, 2, 201, 8, 0, 1, 501, 6, 102, 3, 0, 4, 202, 15, 0, 2, 201, 4, 0, 6, 501, 6, 102, 2, 0, 2, 202, 2, 0, 8, 202, 5, 0, 4, 201, 4, 0, 8, 501, 4, 102, 3, 0, 1, 202, 5, 0, 2, 202, 3, 0, 6, 202, 1, 201, 7, 0, 8, 501, 2, 102, 1, 501, 21, 0, 1, 201, 6, 0, 11, 501, 22, 0, 1, 201, 6, 0, 8, 501, 25, 0, 1, 201, 6, 0, 8, 501, 25, 0, 1, 201, 1, 101, 5, 0, 8, 501, 25, 0, 1, 201, 1, 101, 5, 0, 8, 501, 14, 0, 3, 101, 8, 0, 1, 201, 1, 101, 5, 0, 11, 501, 12, 0, 4, 101, 6, 0, 2, 201, 1, 101, 4, 0, 12, 501, 11, 0, 4, 101, 6, 0, 1, 101, 1, 201, 1, 101, 4, 0, 12, 501, 21, 0, 1, 101, 1, 201, 5, 0, 11, 501, 23, 0, 1, 201, 5, 0, 8, 501, 26, 0, 1, 201, 5, 0, 8, 501, 26, 0, 1, 201, 5, 0, 7, 501, 27, 0, 1, 201, 5, 0, 8, 501, 8, 0

Here is my code:

fi = open(r'D:\2016-2017\Fall2016\SpatialDataStructures\Labs\Lab4\Data\AsRLE.txt','r')
fileLines = fi.readlines()
fi.close
#---------------------------------------------------------------------

#---------------------------------------------------------------------
#Populated variables required for code from fileLines variable  
cellValuesAsString = []
lineNum = 0
for line in fileLines: # for each line the FileLines List
  # get number of cols, ncols, i.e. the 1st line in the file
  if lineNum == 0:
    ncols = int(line[14:])
  # get number of rows, nrows, i.e. the 2nd line in the file
  elif lineNum == 1:
    nrows = int(line[14:])
  # get cell size, cellsize, i.e. the 5th line in the file
  elif lineNum == 4:
    cellsize = int(line[14:])
  # get cell values in RLE format as String, , i.e. the 7th line in the file
  elif lineNum == 6:
    cellValuesAsString = line
  lineNum = lineNum + 1

# removes spaces
cellValuesAsString = cellValuesAsString.replace(" ", "")

# convert string into a list of strings split by comma
cellValuesAsStringList = cellValuesAsString.split(',')

# convert strings to integers
cellValuesAsIntList = cellValuesAsStringList
for index, item in enumerate(cellValuesAsStringList):
    cellValuesAsIntList[index] = int(cellValuesAsStringList[index])
#---------------------------------------------------------------------

#THIS IS WHERE YOU WRITE YOUR CODE

#---------------------------------------------------------------------
cellCode = 201
codeArea = 0

area = 0
npixels = 0
i = 1

for cellValuesAsIntList in line:
  if cellValuesAsIntList[i] == 201:
    b = cellValuesAsIntList[i-1]
    npixels = npixels + b
  else:
    i=i+2




print npixles
print "Area: " + npixels * cellSize

Answer 1

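This looks strenuous. Your problem is that you run out of indices, which raises the IndexError: the loop `for cellValuesAsIntList in line` rebinds the name to each one-character string in `line`, so index 1 is always out of range. Although your case calls for a bit of index awareness per item, fortunately you only need to look at the value from the previous iteration.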
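The failure can be reproduced in isolation. The sketch below assumes `line` still holds the last line read from the file (shortened here to a hypothetical stand-in string); the names `chunk` and `error` are illustrative only. Iterating over a string yields one-character strings, and a one-character string has no index 1.

```python
line = "13, 0, 1, 201"   # hypothetical stand-in for the last line read from the file
i = 1
error = None

try:
    for chunk in line:   # chunk is a single character on every pass
        chunk[i]         # a one-character string has no index 1
except IndexError as exc:
    error = str(exc)

print(error)
```

The loop fails on its very first pass, which matches the traceback in the question.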

For this case, try keeping track of the previous value:

npixels = 0
cells = cellValuesAsIntList                         # for clarity

prev = None
for cell in cells:
    if cell == 201:
        npixels += prev                             # increment operator
    prev = cell

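As a rule of thumb, avoid mutating an index while you iterate. Although that is not your immediate problem, the practice can lead to many side effects. Where possible, iterate over the sequence directly and use each item as it comes. When you do need to change a sequence, sidestep index mutation by other means, such as iterating in reverse, mutating a copy of the sequence, or using a comprehension.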
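An index-free alternative is to pair each run length with its value explicitly, which also avoids mistaking a run length that happens to equal 201 for a cell value. This is only a sketch: the function name `count_cells` and the short `sample` list are made up for illustration, and the area formula assumes `cellsize` is the edge length of a square cell, so one cell covers `cellsize ** 2` units.

```python
def count_cells(rle, target):
    """Sum the run lengths of every (count, value) pair whose value equals target."""
    counts = rle[0::2]   # even positions hold run lengths
    values = rle[1::2]   # odd positions hold cell values
    return sum(count for count, value in zip(counts, values) if value == target)


sample = [13, 0, 1, 201, 26, 0, 2, 201]   # illustrative data, not the full file
cellsize = 25                             # assumed to be the cell edge length

npixels = count_cells(sample, 201)
area = npixels * cellsize ** 2
print(npixels, area)
```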

Answer 2

Suppose you have:

>>> rle=[2,5,1,8,6,9,3,30,6,22,2,12]

You can convert that into run-length tuples like so:

>>> list(zip(*[iter(rle)]*2))
[(2, 5), (1, 8), (6, 9), (3, 30), (6, 22), (2, 12)]
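The pairing idiom works because the list contains the same iterator object twice, so `zip` draws two consecutive items for each tuple. A longhand sketch of the same thing, with illustrative names `it`, `pairs` and `odd_pairs`:

```python
rle = [2, 5, 1, 8, 6, 9, 3, 30, 6, 22, 2, 12]

it = iter(rle)
pairs = list(zip(it, it))   # both arguments advance the one shared iterator
print(pairs)

# With an odd number of items, the unpaired last item is silently dropped.
odd_pairs = list(zip(*[iter([1, 2, 3])] * 2))
print(odd_pairs)
```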

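You can then filter for the tuples that carry a given target value. This example treats the first element of each tuple as the value and uses 2 as the target, because the sample list contains no 201; with your data the target would be 201: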

>>> list(filter(lambda t: t[0]==2, zip(*[iter(rle)]*2)))
[(2, 5), (2, 12)]

Then apply that to your file:

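with open(file_name) as f_in:
   for line in f_in:
       if "," not in line:
           continue   # skip the header lines (ncols, nrows, ...)
       li_as_ints=list(map(int, line.split(",")))
       # the file stores (count, value) pairs, so the value is t[1]
       fl=list(filter(lambda t: t[1]==201, zip(*[iter(li_as_ints)]*2)))
       # fl now holds every (count, 201) run found on this line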

Please do not do something along these lines:

fh=open(a_file_name)
lines=fh.readlines()
for line in lines:
   #process a line

You are reading the entire file into memory unnecessarily.

Instead:

with open(a_file_name) as f_in:
   for line in f_in:
     # process a line
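Putting the pieces together, a line-by-line reader might look like the sketch below. It assumes the file layout shown in the question (header lines without commas, then comma-separated count/value pairs) and that `cellsize` is the edge length of a square cell. The function name `area_of_value` and the tiny 4 x 2 sample grid are hypothetical, written to a temporary file only so the example is self-contained.

```python
import os
import tempfile


def area_of_value(path, target):
    """Return (cell count, area) for target in an RLE grid file."""
    cellsize = None
    npixels = 0
    with open(path) as f_in:
        for line in f_in:                      # one line at a time, no readlines()
            fields = line.split()
            if "," in line:                    # data line: count, value, count, value, ...
                it = iter(int(tok) for tok in line.split(","))
                npixels += sum(count for count, value in zip(it, it) if value == target)
            elif fields and fields[0] == "cellsize":
                cellsize = float(fields[1])
    if cellsize is None:
        raise ValueError("no cellsize line found in header")
    return npixels, npixels * cellsize ** 2


# Hypothetical 4 x 2 grid in the same layout as the question's file.
sample_text = (
    "ncols 4\nnrows 2\nxllcorner 0\nyllcorner 0\n"
    "cellsize 25\nNODATA_value -9999\n"
    "3, 0, 1, 201, 2, 0, 2, 201\n"
)

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample_text)
    path = tmp.name

try:
    result = area_of_value(path, 201)
finally:
    os.remove(path)

print(result)
```

Finding the `cellsize` line by its label rather than by a fixed line number or column offset keeps the reader working if the header spacing changes.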

Looping over a list causes IndexError - Python
