Why do I get this error during fit when creating a decision tree classifier?

Time: 2016-05-20 06:56:53

Tags: pandas machine-learning classification decision-tree

Hello, I am trying to build a decision tree classifier by following the Google Developers video "Hello World - Machine Learning Recipes #1".

Here is my code.

# Import the pandas library
import pandas as pd
# Load the train and test datasets to create two DataFrames
train_url = "http://s3.amazonaws.com/assets.datacamp.com/course/Kaggle/train.csv"
train = pd.read_csv(train_url)
# Print the head of the train DataFrame
train.head()
test_url = "http://s3.amazonaws.com/assets.datacamp.com/course/Kaggle/test.csv"
test = pd.read_csv(test_url)
# Print the head of the test DataFrame
test.head()
# Import the decision tree module from scikit-learn
from sklearn import tree
# Find the best features to predict the survival rate
# Define X_features and Y_labels
col_names = ['Pclass', 'Age', 'SibSp', 'Parch']
X_features = train[col_names]
# Assign the Survived column as the label
Y_labels = train.Survived
# Create a decision tree classifier
clf = tree.DecisionTreeClassifier()
# Fit (find patterns in the data)
clf = clf.fit(X_features, Y_labels)
clf.predict(test[col_names])

I get this error:

  

ValueError                                Traceback (most recent call last)
in ()
     13 #Y_train_sparse = Y_labels.to_sparse()
     14 #fit (find patterns in Data)
---> 15 clf = clf.fit(X_features, Y_labels)
     16 #clf.predict(test[col_names])

C:\Users\nitinahu\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn\tree\tree.py in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
    152         random_state = check_random_state(self.random_state)
    153         if check_input:
--> 154             X = check_array(X, dtype=DTYPE, accept_sparse="csc")
    155             if issparse(X):
    156                 X.sort_indices()

C:\Users\nitinahu\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    396                              % (array.ndim, estimator_name))
    397     if force_all_finite:
--> 398         _assert_all_finite(array)
    399
    400     shape_repr = _shape_repr(array.shape)

C:\Users\nitinahu\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn\utils\validation.py in _assert_all_finite(X)
     52             and not np.isfinite(X).all()):
     53         raise ValueError("Input contains NaN, infinity"
---> 54                          " or a value too large for %r." % X.dtype)
     55
     56

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

1 Answer:

Answer 0: (score: 0)

Just check all the values in the data you are passing to fit.
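One way to do that check is sketched below. It is only an illustration: the small DataFrame is a hypothetical stand-in for the frame loaded with pd.read_csv(train_url), and the missing Age entry is an assumption about what the real data might look like. It counts, per feature column, the entries that are NaN and the entries that are not finite, which are the cases the error message lists.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the training DataFrame from the question;
# in practice this would be the result of pd.read_csv(train_url).
train = pd.DataFrame({
    "Pclass": [3, 1, 3, 2],
    "Age": [22.0, np.nan, 26.0, 35.0],  # assumed: one missing age
    "SibSp": [1, 1, 0, 0],
    "Parch": [0, 0, 0, 1],
    "Survived": [0, 1, 1, 0],
})
col_names = ["Pclass", "Age", "SibSp", "Parch"]

# Number of missing (NaN) entries in each feature column.
nan_counts = train[col_names].isnull().sum()

# Number of non-finite entries (NaN, +inf or -inf) in each feature column.
values = train[col_names].to_numpy(dtype="float64")
non_finite_counts = pd.Series((~np.isfinite(values)).sum(axis=0), index=col_names)

print(nan_counts)
print(non_finite_counts)
```

Any column with a non-zero count here would make fit raise the ValueError shown in the question.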

One or two of the values are outside the allowed range, which causes an overflow.
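If the offending values turn out to be missing entries (NaN) rather than genuinely huge numbers, one possible fix is to fill them before fitting. The sketch below is not necessarily what this answer had in mind: it assumes the gaps are in the Age column, uses small hypothetical DataFrames in place of the real train and test data, and picks median imputation as one option among several (dropping the affected rows is another).

```python
import numpy as np
import pandas as pd
from sklearn import tree

# Hypothetical stand-ins for the train and test DataFrames from the question.
train = pd.DataFrame({
    "Pclass": [3, 1, 3, 2, 1, 3],
    "Age": [22.0, np.nan, 26.0, 35.0, 54.0, np.nan],  # assumed missing ages
    "SibSp": [1, 1, 0, 0, 0, 3],
    "Parch": [0, 0, 0, 1, 0, 1],
    "Survived": [0, 1, 1, 0, 0, 1],
})
test = pd.DataFrame({
    "Pclass": [3, 2],
    "Age": [np.nan, 47.0],
    "SibSp": [0, 1],
    "Parch": [0, 0],
})
col_names = ["Pclass", "Age", "SibSp", "Parch"]

# Fill missing ages with the median age of the training set,
# and reuse that same value for the test set.
age_median = train["Age"].median()
X_features = train[col_names].fillna({"Age": age_median})
X_test = test[col_names].fillna({"Age": age_median})
Y_labels = train["Survived"]

# With no NaN left in the features, fit and predict run without the ValueError.
clf = tree.DecisionTreeClassifier(random_state=0)
clf = clf.fit(X_features, Y_labels)
predictions = clf.predict(X_test)
print(predictions)
```

The test features need the same treatment as the training features, otherwise predict fails with the same error.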