Question

I followed this tutorial for sentiment analysis: https://stackabuse.com/python-for-nlp-sentiment-analysis-with-scikit-learn/

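However, I am not a professional, so I do not understand every step in detail. I now want to apply the trained model to new data, using this tutorial: https://stackabuse.com/scikit-learn-save-and-restore-models/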

But at the line

  score = pickle_model.score(Xtest, Ytest)

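I get a ValueError: could not convert string to float: 'positive' ('positive' is one of the labels from the sentiment analysis above). To my surprise, the error occurs even when I pass X_train and y_train (from the first tutorial), although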

text_classifier.fit(X_train, y_train)

works fine without any error. So I assume that fit() does something that score() does not, and that this causes the problem. However, I do not know how to fix it.

Here is the full error message:

ValueError                                Traceback (most recent call last)

<ipython-input-210-070f6faef44c> in <module>
     34 print(len(X_train))
     35 print(len(y_train))
---> 36 score = pickle_model.score(X_train, y_train)
     37 print("Test score: {0:.2f} %".format(100 * score))
     38 

~\Anaconda3\lib\site-packages\sklearn\base.py in score(self, X, y, sample_weight)
    408         y_pred = self.predict(X)
    409         # XXX: Remove the check in 0.23
--> 410         y_type, _, _, _ = _check_reg_targets(y, y_pred, None)
    411         if y_type == 'continuous-multioutput':
    412             warnings.warn("The default value of multioutput (not exposed in "

~\Anaconda3\lib\site-packages\sklearn\metrics\regression.py in _check_reg_targets(y_true, y_pred, multioutput)
     76     """
     77     check_consistent_length(y_true, y_pred)
---> 78     y_true = check_array(y_true, ensure_2d=False)
     79     y_pred = check_array(y_pred, ensure_2d=False)
     80 

~\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    494             try:
    495                 warnings.simplefilter('error', ComplexWarning)
--> 496                 array = np.asarray(array, dtype=dtype, order=order)
    497             except ComplexWarning:
    498                 raise ValueError("Complex data not supported\n"

~\Anaconda3\lib\site-packages\numpy\core\numeric.py in asarray(a, dtype, order)
    536 
    537     """
--> 538     return array(a, dtype, copy=False, order=order)
    539 
    540

ValueError: could not convert string to float: 'positive'

Here is the code snippet where the error occurs:
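
A note on reading the traceback: the failing frames are `score` in `sklearn/base.py` calling `_check_reg_targets` in `sklearn/metrics/regression.py`. That is the R² path used by regressors, which suggests (this is an inference from the traceback, not something the question states) that `pickle_model` may hold a regressor rather than the fitted `RandomForestClassifier`. A minimal sketch of the difference, using made-up toy data and assuming a reasonably recent scikit-learn:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Toy data (hypothetical, not the chat data from the question)
X = np.array([[0.0], [0.1], [0.9], [1.0]])
y_text = np.array(["negative", "negative", "positive", "positive"])
y_num = np.array([0.0, 0.0, 1.0, 1.0])

clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y_text)
reg = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y_num)

# Classifier.score computes accuracy, so string labels are accepted
clf_score = clf.score(X, y_text)

# Regressor.score computes R^2 and needs numeric targets,
# so passing string labels is expected to raise
try:
    reg.score(X, y_text)
    reg_error = None
except (ValueError, TypeError) as exc:
    reg_error = exc

print("classifier accuracy:", clf_score)
print("regressor error:", reg_error)
```

If this is what is happening, checking `type(pickle_model)` right after loading would show which estimator the pickle file actually contains.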

vectorizer = TfidfVectorizer (max_features=2500, min_df=1, max_df=1, stop_words=stopwords.words('english'))
chat_data = vectorizer.fit_transform(chat_data).toarray()

X_train, X_test, y_train, y_test = train_test_split(chat_data, chat_labels, test_size=0.2, random_state=0)

text_classifier = RandomForestClassifier(n_estimators=200, random_state=0)
text_classifier.fit(X_train, y_train)
predictions = text_classifier.predict(X_test)
X_train = np.array(X_train).reshape((-1,1))
y_train = np.array(y_train).reshape((-1,1))
print(len(X_train))
print(len(y_train))
score = pickle_model.score(X_train, y_train)
print("Test score: {0:.2f} %".format(100 * score))
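
For comparison, here is a minimal sketch of one way to save and restore the classifier so that `score()` receives the same kind of input as `fit()`. It bundles the vectorizer and the classifier in a `Pipeline`, so the restored object accepts raw text and string labels directly, and no `reshape((-1,1))` is applied (that call would change the shape of the feature matrix the model was trained on). The texts and labels are made-up stand-ins for `chat_data` and `chat_labels`, and the model is pickled in memory with `pickle.dumps` instead of to a file; both are assumptions for illustration only.

```python
import pickle

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical stand-ins for chat_data and chat_labels
texts = ["great service", "love it", "terrible delay",
         "awful support", "really great", "truly awful"]
labels = ["positive", "positive", "negative",
          "negative", "positive", "negative"]

# Vectorizer and classifier travel together in one object
model = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(n_estimators=50, random_state=0),
)
model.fit(texts, labels)

# Save and restore (in memory here; pickle.dump/load to a file works the same way)
blob = pickle.dumps(model)
pickle_model = pickle.loads(blob)

# The restored pipeline scores raw text against string labels
score = pickle_model.score(texts, labels)
print("Training score: {0:.2f} %".format(100 * score))

# New data goes through the same fitted vectorizer automatically
prediction = pickle_model.predict(["great support"])
print(prediction)
```

The design choice here is that the fitted `TfidfVectorizer` is saved alongside the model; new text must be transformed with the same fitted vocabulary, not with a freshly fitted vectorizer.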

Python: Cannot convert from String to Float

0 answers: