IndexError: what can be used as an index?

Date: 2017-05-28 01:16:16

Tags: python numpy scipy

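I am getting this error: IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices. I am building a sound recognition application. My code is: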

import numpy as np
import pandas as pd
import scipy as sp
import  pickle
from scipy import fft
from time import localtime, strftime
import matplotlib.pyplot as plt
from skimage.morphology import  disk,remove_small_objects
from skimage.filter import rank
from skimage.util import img_as_ubyte 
import wave

folder = 'mlsp_contest_dataset/'


essential_folder = folder+'essential_data/'
supplemental_folder = folder+'supplemental_data/'
spectro_folder =folder+'my_spectro/'
single_spectro_folder =folder+'my_spectro_single/'
dp_folder = folder+'DP/'

# Each audio file has a unique recording identifier ("rec_id"), ranging from 0 to 644. 
# The file rec_id2filename.txt indicates which wav file is associated with each rec_id.
rec2f = pd.read_csv(essential_folder + 'rec_id2filename.txt', sep = ',')

# There are 19 bird species in the dataset. species_list.txt gives each a number from 0 to 18. 
species = pd.read_csv(essential_folder + 'species_list.txt', sep = ',')
num_species = 19

# The dataset is split into training and test sets. 
# CVfolds_2.txt gives the fold for each rec_id. 0 is the training set, and 1 is the test set.
cv =  pd.read_csv(essential_folder + 'CVfolds_2.txt', sep = ',')

# This is your main label training data. For each rec_id, a set of species is listed. The format is:
# rec_id,[labels]
raw =  pd.read_csv(essential_folder + 'rec_labels_test_hidden.txt', sep = ';')
label = np.zeros(len(raw)*num_species)
label = label.reshape([len(raw),num_species])
for i in range(len(raw)):
    line = raw.iloc[i]
    labels = line[0].split(',')
    labels.pop(0) # rec_id == i
    for c in labels:
        if(c != '?'):
            print(label)
            label[i,c] = 1

When I run this code, I get the error at the line label[i,c] = 1. I tried inspecting the label variable with print(label), and it looks like this:

warn(skimage_deprecation('The `skimage.filter` module has been renamed '
[[ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 ...,
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]]

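I think the error means that integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays cannot be used as array indices, but I have used an int as an array index many times, so I cannot understand why this error occurs. The debugger tells me the following: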

labels holds ['?'], and c in

for c in labels[i]:

holds '?'. I really cannot understand this. I think this ? is causing the error, but I do not know how to fix it. How can I solve this?

1 answer:

Answer 0 (score: 0)

The error message says that, when indexing a numpy array,

only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices 

label is a 2d float array whose values are currently all 0:

label = np.zeros([len(raw),num_species])
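As a quick check, the one-call form above builds the same array as the two-step zeros-then-reshape construction in the question. This is only a sketch: n_rows is a hypothetical stand-in for len(raw), since the real file is not shown.

```python
import numpy as np

n_rows = 4          # hypothetical stand-in for len(raw)
num_species = 19

# Two-step construction, as in the question
two_step = np.zeros(n_rows * num_species).reshape([n_rows, num_species])

# One-call construction, as in the answer
label = np.zeros([n_rows, num_species])

print(label.shape)                      # (4, 19)
print(label.dtype)                      # float64
print(np.array_equal(two_step, label))  # True
```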

In the loop:

for i in range(len(raw)):        # i=0,1,2,...

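Have you checked what raw looks like? Since it comes from pd.read_csv, I imagine it is a dataframe; iloc[i] selects a row, but one that has not yet been split into columns?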

    line = raw.iloc[i]
    labels = line[0].split(',')
    labels.pop(0) # rec_id == i

What does labels look like? I am guessing it is a list of strings.

    for c in labels:
        if(c != '?'):          # evidently `c` is a string
            print(label)       # prints the 2d array
            label[i,c] = 1
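A small sketch of why c ends up as a string. The sample line below is made up for illustration (the real file contents are not shown in the question); str.split always returns a list of strings, even when the pieces look like numbers.

```python
# Hypothetical row in the "rec_id,[labels]" format described in the question
line = "3,1,7"

labels = line.split(',')
labels.pop(0)            # drop the rec_id field

print(labels)                              # ['1', '7']
print([type(c).__name__ for c in labels])  # ['str', 'str']
```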

Indexing a 2d array should look like label[0,1]. c could be any of the other things named in the error message, but it cannot be a string.

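Dataframes allow indexing with strings; that is a pandas feature. But numpy arrays must be indexed with numbers, or with one of the few alternatives listed in the error message. They are not indexed with strings (except for structured arrays).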

In [209]: label = np.zeros((3,5))
In [210]: label
Out[210]: 
array([[ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.]])
In [211]: label[1,3]
Out[211]: 0.0
In [212]: label[1,3]=1      # index with integers OK
In [213]: label[0,2]=1
In [214]: label[0,'?'] =1    # index with a string - ERROR
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-214-3738f623c78e> in <module>()
----> 1 label[0,'?'] =1

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

In [215]: label[0,:] =2     # index with a slice
In [216]: label
Out[216]: 
array([[ 2.,  2.,  2.,  2.,  2.],
       [ 0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.]])
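One possible fix, sketched under the assumption that every label other than '?' is the text of an integer species number between 0 and 18: convert c with int() before indexing. The sample rows are invented for illustration and stand in for the lines read from rec_labels_test_hidden.txt.

```python
import numpy as np

num_species = 19

# Hypothetical rows in "rec_id,[labels]" form; '?' marks a hidden test label
rows = ["0,11,12", "1,?", "2,10", "3"]

label = np.zeros((len(rows), num_species))

for i, row in enumerate(rows):
    fields = row.split(',')
    fields.pop(0)                 # drop the rec_id field
    for c in fields:
        if c != '?':
            label[i, int(c)] = 1  # int(c) turns '11' into 11, a valid index

print(label[0, 11], label[0, 12], label[2, 10])  # 1.0 1.0 1.0
print(label.sum())                                # 3.0
```

If a field contains non-numeric text other than '?', int(c) raises a ValueError, which surfaces bad input instead of silently mislabeling a row.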