Question

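I get an IndexError when trying to slice my data. The data are word tokens stored in a text file, loaded as an object array of shape (10848135,). Before reshaping the array to 2D, I need advice on the following: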

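How can I use numpy to inspect the data (or the file) to make sure the array is not an array of lists or a variable-length (ragged) array? Checking the data file by hand is a daunting task.
How do I convert an object-dtype array to integers, given that the function to_categorical requires an int dtype?
What should I watch out for when slicing the array?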
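To make the first question concrete, here is a minimal sketch of the kind of check I have in mind. It assumes the object array holds Python lists (or tuples/arrays) of token ids, which I have not confirmed for my data. The helper name `describe_object_array` and the sample `sequences` are hypothetical stand-ins, not part of my project.

```python
import numpy as np

def describe_object_array(arr):
    """Summarize element types and lengths of a 1-D object array."""
    # Collect the Python type of every element (e.g. 'list', 'int').
    types = {type(x).__name__ for x in arr}
    # Scalars count as length 1; sequences report their own length.
    lengths = np.array(
        [len(x) if isinstance(x, (list, tuple, np.ndarray)) else 1 for x in arr]
    )
    return {
        "types": types,
        "min_len": int(lengths.min()),
        "max_len": int(lengths.max()),
        "is_ragged": bool(lengths.min() != lengths.max()),
    }

# Hypothetical stand-in for the output of tokenizer.texts_to_sequences(data).
sequences = [[1, 2, 3], [4, 5], [6]]

# Fill the object array element by element so each list stays one element.
arr = np.empty(len(sequences), dtype=object)
for i, seq in enumerate(sequences):
    arr[i] = seq

summary = describe_object_array(arr)
print(summary)
```

If `is_ragged` comes back true, the rows have different lengths, which would explain why the array keeps dtype object instead of becoming a regular 2D integer array.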

Below is the function that raises the index error:

def encode_words(self, dataset):
    data = dataset.split('\n')
    newShape = 2, -1
    tokenizer = Tokenizer()
    tokenizer.fit_on_texts(data)
    sequences = tokenizer.texts_to_sequences(data)
    vocab_size = len(tokenizer.word_index) + 1
    sequences = array(sequences)
    #sequences = np.array2string(sequences)
    sequences  = np.reshape(sequences, newShape)
    #sequences = np.array2string(sequences)
    print(sequences.dtype)
    print(sequences.shape)
    X, y = sequences[:-1], sequences[-1]
    print(y.dtype)
    #y = np.array2string(y)
    y = to_categorical(y, num_classes=vocab_size)
    seq_length = X.shape[1]
    return X, y, vocab_size, seq_length, tokenizer

Below is the error message:

Reloaded modules: WordEmbedding

object
(2, 104309)
object
Traceback (most recent call last):

File "<ipython-input-18-9db02c6b1f06>", line 1, in <module>
  runfile('/home/asifa/anaconda3/deep_learning_project/processor.py', wdir='/home/asifa/anaconda3/deep_learning_project')

File "/home/asifa/anaconda3/envs/researchProject/lib/python3.6/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
  execfile(filename, namespace)

File "/home/asifa/anaconda3/envs/researchProject/lib/python3.6/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
  exec(compile(f.read(), filename, 'exec'), namespace)

File "/home/asifa/anaconda3/deep_learning_project/processor.py", line 15, in <module>
  X,y,vocab_size,seq_length,tokenizer = emb.encode_words(seq_data)

File "/home/asifa/anaconda3/deep_learning_project/WordEmbedding.py", line 77, in encode_words
  y = to_categorical(y, num_classes=vocab_size)

File "/home/asifa/anaconda3/envs/researchProject/lib/python3.6/site-packages/keras/utils/np_utils.py", line 25, in to_categorical
  y = np.array(y, dtype='int')

ValueError: setting an array element with a sequence.
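
The failing conversion in the traceback can be reproduced in isolation. The sketch below assumes the object array contains lists of unequal length, which is my guess rather than something I have verified. It also shows one possible way to get an integer array by zero-padding each row to a common length. The sample `sequences` and the choice of 0 as the padding value are assumptions made only for illustration.

```python
import numpy as np

# Hypothetical ragged token sequences (stand-in for the real data).
sequences = [[1, 2, 3], [4, 5], [6]]

# Build a 1-D object array whose elements are Python lists.
ragged = np.empty(len(sequences), dtype=object)
for i, seq in enumerate(sequences):
    ragged[i] = seq

# This mirrors the call inside to_categorical shown in the traceback.
try:
    np.array(ragged, dtype="int")
    conversion_failed = False
except (ValueError, TypeError):
    # Each element is a list, so it cannot become a single integer.
    conversion_failed = True

# One possible workaround: pad every row with zeros to the longest length.
max_len = max(len(seq) for seq in sequences)
padded = np.zeros((len(sequences), max_len), dtype=int)
for i, seq in enumerate(sequences):
    padded[i, :len(seq)] = seq

print(conversion_failed)
print(padded.dtype, padded.shape)
```

Padding changes the data by adding zeros, so whether it is appropriate depends on how the sequences are meant to be used downstream.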

numpy array - reshaping and slicing

0 answers: