IndexError: tuple index out of range

Time: 2015-08-05 12:11:14

Tags: numpy scikit-learn theano

I want to get this classifier working. It is an extension to scikit-learn and depends on Theano.

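My goal is to fit a neural network to a list of years and teach it to tell whether a given year is a leap year (I will widen the range later). But when I try to run this example, I get an error.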

My code looks like this:

leapyear.py

import numpy as np
import calendar

from sknn.mlp import Classifier, Layer
from sklearn.cross_validation import train_test_split

# create years in range
years = np.arange(1970, 2001)
pre_is_leap = []

# test if year is a leapyear
for x in years:
    pre_is_leap.append(calendar.isleap(x))

# convert true, false list to 0,1 list
is_leap = np.array(pre_is_leap, dtype=bool).astype(int)

# split
years_train, years_test, is_leap_train, is_leap_test = train_test_split(years, is_leap, test_size=0.33, random_state=42)

# test output
print(len(years_train))
print(len(is_leap_train))
print(years_train)
print(is_leap_train)

#neural network
nn = Classifier(
    layers=[
        Layer("Maxout", units=100, pieces=2),
        Layer("Softmax")],
    learning_rate=0.001,
    n_iter=25)

# fit

nn.fit(years_train, is_leap_train)
#nn.fit(np.array(years_train), np.array(is_leap_train))

requirements.txt

numpy==1.9.2
PyYAML==3.11
scikit-learn==0.16.1
scikit-neuralnetwork==0.3
scipy==0.16.0
Theano==0.7.0

My output, including the error:

20
20
[1986 1975 1983 1981 1992 1971 1972 1995 1973 1991 1996 1988 2000 1990 1977
 1980 1984 1998 1989 1976]
[0 0 0 0 1 0 1 0 0 0 1 1 1 0 0 1 1 0 0 1]
/home/devnull/master/scikit/env/lib/python3.4/site-packages/sklearn/utils/validation.py:498: UserWarning: MinMaxScaler assumes floating point values as input, got int64
  "got %s" % (estimator, X.dtype))
/home/devnull/master/scikit/env/lib/python3.4/site-packages/sklearn/preprocessing/data.py:256: DeprecationWarning: Implicitly casting between incompatible kinds. In a future numpy release, this will raise an error. Use casting="unsafe" if this is intentional.
  X *= self.scale_
/home/devnull/master/scikit/env/lib/python3.4/site-packages/sklearn/preprocessing/data.py:257: DeprecationWarning: Implicitly casting between incompatible kinds. In a future numpy release, this will raise an error. Use casting="unsafe" if this is intentional.
  X += self.min_
Traceback (most recent call last):
  File "/home/devnull/master/scikit/leapyear.py", line 47, in <module>
    pipeline.fit(years_train, is_leap_train)
  File "/home/devnull/master/scikit/env/lib/python3.4/site-packages/sklearn/pipeline.py", line 141, in fit
    self.steps[-1][-1].fit(Xt, y, **fit_params)
  File "/home/devnull/master/scikit/env/lib/python3.4/site-packages/sknn/mlp.py", line 283, in fit
    return super(Classifier, self)._fit(X, yp)
  File "/home/devnull/master/scikit/env/lib/python3.4/site-packages/sknn/mlp.py", line 127, in _fit
    X, y = self._initialize(X, y)
  File "/home/devnull/master/scikit/env/lib/python3.4/site-packages/sknn/mlp.py", line 37, in _initialize
    self._create_specs(X, y)
  File "/home/devnull/master/scikit/env/lib/python3.4/site-packages/sknn/mlp.py", line 67, in _create_specs
    self.unit_counts = [numpy.product(X.shape[1:]) if self.is_convolution else X.shape[1]]
IndexError: tuple index out of range

I looked at the source of mlp.py, but I don't know how to fix it. What has to change so that I can fit my network?

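Update (not related to the question): I just want to add that I had to convert the years to a binary representation; after that, the neural network worked.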
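The update does not show how the years were encoded, so the following is only a sketch of one possible binary representation. The helper name `years_to_bits` and the choice of 11 bits (enough for years up to 2047) are assumptions made for illustration, not the asker's actual code.

```python
import numpy as np

def years_to_bits(years, n_bits=11):
    """Hypothetical helper: encode each year as a row of 0/1 features.

    The most significant bit comes first. 11 bits cover values up to 2047.
    """
    years = np.asarray(years, dtype=np.int64)
    # Shift amounts from the highest bit down to bit 0.
    shifts = np.arange(n_bits - 1, -1, -1)
    # Broadcasting: (N, 1) >> (n_bits,) gives an (N, n_bits) array.
    bits = (years[:, np.newaxis] >> shifts) & 1
    return bits.astype(np.float64)

X = years_to_bits(np.arange(1970, 2001))
print(X.shape)  # one row per year, one column per bit
print(X[-1])    # bit pattern for the year 2000
```

The result is already a 2D array of floats with one row per sample, so it can be passed to `fit` in place of the raw years.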

1 Answer:

Answer 0: (score: 1)

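The problem is that the classifier requires the data to be presented as a 2-dimensional numpy array, where the first axis is the samples and the second axis is the features.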
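To see where the exception comes from, here is a small sketch that uses only numpy and does not involve sknn. It reproduces the same kind of `X.shape[1]` lookup that appears on the last line of the traceback, applied to a 1D array like the `years` array in the question.

```python
import numpy as np

years_1d = np.arange(1970, 2001)
print(years_1d.shape)  # a one-element tuple: the array has a single axis

message = None
try:
    # Same kind of lookup as `X.shape[1]` in the traceback:
    # a 1D array has no second axis, so indexing the shape tuple fails.
    n_features = years_1d.shape[1]
except IndexError as exc:
    message = str(exc)

print("IndexError:", message)
```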

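In your case you have only one "feature" (the year), so you need to convert the years data into an Nx1 2D numpy array. This can be done by adding the following line before the data split statement: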

years = np.array([[year] for year in years])
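As a quick check of what that line does, the sketch below compares it with two standard numpy alternatives, `reshape(-1, 1)` and indexing with `np.newaxis`. The variable names are illustrative only; all three produce the same Nx1 array.

```python
import numpy as np

years = np.arange(1970, 2001)

# The line from the answer: wrap every year in its own one-element row.
as_list_comp = np.array([[year] for year in years])

# Equivalent numpy idioms that avoid the Python-level loop.
as_reshape = years.reshape(-1, 1)   # -1 lets numpy infer the number of rows
as_newaxis = years[:, np.newaxis]   # adds a second axis of length 1

print(as_list_comp.shape)
print(np.array_equal(as_list_comp, as_reshape))
print(np.array_equal(as_reshape, as_newaxis))
```

With any of these in place, `X.shape[1]` exists and equals 1, which is the single feature the answer describes.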