要对python进行逻辑回归,这是我的代码:
导入的数据集:Facebook Metrics
# Load dataset
url = "dataset_Facebook.csv"
dataset1 = pandas.read_csv(url, sep = ";", header = 0)
# Split-out validation dataset
array = dataset1.values
X = array[:,0:4]
Y = array[:,4]
validation_size = 0.20
seed = 7
X_train, X_validation, Y_train, Y_validation = model_selection.train_test_split(X, Y, test_size=validation_size, random_state=seed)
# Test options and evaluation metric
seed = 7
scoring = 'accuracy'
# Spot Check Algorithms
models = []
models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB', GaussianNB()))
models.append(('SVM', SVC()))
# evaluate each model in turn
results = []
names = []
for name, model in models:
kfold = model_selection.KFold(n_splits=10, random_state=seed)
cv_results = np.log10(model_selection.cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring))
results.append(cv_results)
names.append(name)
msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
print(msg)
编译程序时,我得到以下错误:
Traceback (most recent call last):
File "/Users/ernestsoo/Desktop/WESTWORLD (Season 01) DUB 720/Assignment2.JackyTen.ErnestSoo/assignment2.py", line 93, in <module>
cv_results = np.log10(model_selection.cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring))
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/model_selection/_validation.py", line 140, in cross_val_score
for train, test in cv_iter)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 758, in __call__
while self.dispatch_one_batch(iterator):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 608, in dispatch_one_batch
self._dispatch(tasks)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 571, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 109, in apply_async
result = ImmediateResult(func)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 326, in __init__
self.results = batch()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 131, in __call__
return [func(*args, **kwargs) for func, args, kwargs in self.items]
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 131, in <listcomp>
return [func(*args, **kwargs) for func, args, kwargs in self.items]
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/model_selection/_validation.py", line 238, in _fit_and_score
estimator.fit(X_train, y_train, **fit_params)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/linear_model/logistic.py", line 1173, in fit
order="C")
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/utils/validation.py", line 521, in check_X_y
ensure_min_features, warn_on_dtype, estimator)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/sklearn/utils/validation.py", line 382, in check_array
array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: could not convert string to float: 'Status'
因为它看起来像是一个DataType问题,所以我尝试将数据集中的值解析为float:
array = float(dataset1.values)
但这不起作用。
我该如何解决这个问题?
答案 0 :(得分:2)
ValueError: could not convert string to float: 'Status'
此错误表示在某些时候您的代码正在尝试将字符串“Status”转换为float 。将数据转换为float将无法解决问题。问题是你的代码试图抛出不应该的东西。
如果您执行该代码:float("Hello")
它会引发ValueError: could not convert string to float: 'Hello'
。使用错误信息调试代码。尝试找到“Status”字符串在预期浮点数的位置。
希望它能帮助您调试代码