How to avoid the error "TypeError: invalid data type for einsum" in Python

Time: 2015-06-18 18:09:00

Tags: python python-2.7 numpy pandas machine-learning

I am trying to load a CSV file into a NumPy array and use that array in LogisticRegression and similar estimators. Right now I am stuck on the error shown below:

import numpy as np
import pandas as pd

from sklearn import metrics  # needed for classification_report / confusion_matrix below
from sklearn import preprocessing
from sklearn.linear_model import LogisticRegression

dataset =  pd.read_csv('../Bookie_test.csv').values
X = dataset[1:, 32:34]
y = dataset[1:, 14]

# normalize the data attributes
normalized_X = preprocessing.normalize(X)
# standardize the data attributes
standardized_X = preprocessing.scale(X)

model = LogisticRegression()
model.fit(X, y)
print(model)
# make predictions
expected = y
predicted = model.predict(X)
# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))

I get this error:

> C:\Anaconda32\lib\site-packages\sklearn\utils\validation.py:332: UserWarning: The normalize function assumes floating point values as input, got object
>   "got %s" % (estimator, X.dtype))
> Traceback (most recent call last):
>   File "X:/test3.py", line 23, in <module>
>     normalized_X = preprocessing.normalize(X)
>   File "C:\Anaconda32\lib\site-packages\sklearn\preprocessing\data.py", line 553, in normalize
>     norms = row_norms(X)
>   File "C:\Anaconda32\lib\site-packages\sklearn\utils\extmath.py", line 65, in row_norms
>     norms = np.einsum('ij,ij->i', X, X)
> TypeError: invalid data type for einsum
  

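I am new to Python and do not like this chain of conversions:

  1. Load the CSV into Pandas
  2. Convert the Pandas data to NumPy
  3. Use the NumPy array in LogisticRegression

Is there a simpler way, for example:

  1. Load the CSV into Pandas
  2. Use the Pandas DataFrame directly in the ML methods?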

1 answer:

Answer 0: (score: 0)

Regarding the main question: thanks to Evert for the suggestion, I will check it.
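Evert's suggestion is not quoted in this thread, so the following is only one plausible reading of the traceback. The warning says that normalize received an array of dtype object, which is what DataFrame.values returns when the CSV mixes text and numeric columns. A minimal sketch, assuming the selected columns really contain numbers, converts the slice to float before passing it on. The DataFrame and its column names are hypothetical stand-ins for the CSV.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV: one text column plus two numeric columns.
df = pd.DataFrame({
    "team": ["a", "b", "c"],
    "odds_home": [1.5, 2.0, 3.2],
    "odds_away": [2.5, 1.8, 1.3],
})

# Mixed column types force .values to fall back to dtype object.
dataset = df.values
X_obj = dataset[:, 1:3]
print(X_obj.dtype)  # object

# Convert the numeric slice explicitly; this raises ValueError
# if any cell cannot be parsed as a number.
X = X_obj.astype(np.float64)
print(X.dtype)  # float64
```

With a float array, preprocessing.normalize(X) and preprocessing.scale(X) no longer receive object data, which is what the warning complained about.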

Regarding #2: I found a great tutorial, http://www.markhneedham.com/blog/2013/11/09/python-making-scikit-learn-and-pandas-play-nice/, and got the expected result using pandas + sklearn.
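The tutorial's code is not reproduced here. As a rough sketch of the pandas + sklearn approach, scikit-learn estimators accept a DataFrame of numeric columns directly, so the manual conversion to a NumPy array can be skipped. The data and column names below are invented for illustration and are not taken from the original CSV.

```python
import pandas as pd
from sklearn import metrics
from sklearn.linear_model import LogisticRegression

# Hypothetical data standing in for the CSV contents.
df = pd.DataFrame({
    "odds_home": [1.2, 1.4, 1.5, 3.0, 3.5, 4.0],
    "odds_away": [4.0, 3.5, 3.1, 1.5, 1.4, 1.2],
    "home_win": [1, 1, 1, 0, 0, 0],
})

# Select features and target by column name; both stay pandas objects.
X = df[["odds_home", "odds_away"]]
y = df["home_win"]

model = LogisticRegression()
model.fit(X, y)

# Predict on the training data and summarize the fit.
predicted = model.predict(X)
print(metrics.confusion_matrix(y, predicted))
```

Selecting columns by name also avoids pulling a text column into the feature matrix by accident, which is how an object-dtype array usually arises.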