ValueError: Unknown label type: 'unknown'

Time: 2017-07-27 09:23:47

Tags: python pandas numpy scikit-learn logistic-regression

I am trying to run the following code. By the way, I am new to both Python and sklearn.

import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression


# data import and preparation
trainData = pd.read_csv('train.csv')
train = trainData.values
testData = pd.read_csv('test.csv')
test = testData.values
X = np.c_[train[:, 0], train[:, 2], train[:, 6:7],  train[:, 9]]
X = np.nan_to_num(X)
y = train[:, 1]
Xtest = np.c_[test[:, 0:1], test[:, 5:6],  test[:, 8]]
Xtest = np.nan_to_num(Xtest)


# model
lr = LogisticRegression()
lr.fit(X, y)

where y is an np.ndarray of 0s and 1s.

I get the following:

  

  File "C:\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py", line 1174, in fit
    check_classification_targets(y)
  File "C:\Anaconda3\lib\site-packages\sklearn\utils\multiclass.py", line 172, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y_type)
ValueError: Unknown label type: 'unknown'

From the sklearn documentation:

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression.fit

y : array-like, shape (n_samples,)
    Target values (class labels in classification, real numbers in regression)

What is my mistake?

UPD:

y is array([0.0, 1.0, 1.0, ..., 0.0, 1.0, 0.0], dtype=object) and its shape is (891,)

2 answers:

Answer 0: (score: 53)

Your y is of type object, so sklearn cannot recognize its type. Add the line y=y.astype('int') right after the line y = train[:, 1].
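A minimal sketch of why this happens and what the cast changes, using a small hand-made array in place of train.csv (the values are made up for illustration). It assumes that trainData.values produced an object array because the CSV mixes numeric and string columns, and it uses sklearn.utils.multiclass.type_of_target to inspect how sklearn classifies the labels.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-in for trainData.values: a string column forces
# the whole array to be stored with dtype=object.
train = np.array(
    [[1, 0.0, "a", 22.0],
     [2, 1.0, "b", 38.0],
     [3, 1.0, "c", 26.0]],
    dtype=object,
)

# Slicing an object array keeps dtype=object, even for numeric values.
y_raw = train[:, 1]
print(y_raw.dtype)              # object
print(type_of_target(y_raw))    # unknown

# Casting to an integer dtype lets sklearn recognise the class labels.
y_fixed = y_raw.astype("int")
print(y_fixed.tolist())         # [0, 1, 1]
print(type_of_target(y_fixed))  # binary
```

The same slice-keeps-object behaviour applies to X, which is why the cast has to be applied to y explicitly before calling lr.fit(X, y).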

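Answer 1: (score: 0)

In addition to Miriam's answer: I ran into a similar error, but in my case the individual elements of y_pred were of type 'np.int32', while the individual elements of y were of type 'int'. I solved it as follows: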

for i,x in enumerate(y_pred):
    y_pred[i]=x.astype('int')
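As an alternative sketch, the element-by-element loop can usually be replaced by casting both arrays to one common dtype in a single step. The arrays below are hypothetical, and np.int64 is just one reasonable choice of common dtype.

```python
import numpy as np

# Hypothetical predictions stored as 32-bit integers.
y_pred = np.array([0, 1, 1, 0], dtype=np.int32)
# Hypothetical true labels stored as plain Python ints.
y = [0, 1, 1, 0]

# Cast both sides to the same integer dtype at once.
y_pred = np.asarray(y_pred).astype(np.int64)
y = np.asarray(y, dtype=np.int64)

print(y_pred.dtype, y.dtype)      # int64 int64
print(bool((y_pred == y).all()))  # True
```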