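Error when running a random forest classifier

Time: 2019-10-04 04:47:44

Tags: python scikit-learn random-forest

I am trying to implement a random forest classifier in Python, but it raises a ValueError. The sample code is: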

from sklearn.ensemble import RandomForestClassifier
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("0.5-1.csv")
df.head()

X = df[['wavelength', 'phase velocity']]
y = df['shear wave velocity']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

print (len(X_train),len(X_test),len(y_train),len(y_test))

rf = RandomForestClassifier(n_estimators=40)
rf.fit(X_train, y_train)
print (rf.score(X_test, y_test))

The error is:

Traceback (most recent call last):
  File "G:\My Drive\ANN\test\0.5-1\0.5-1_tunecode.py", line 23, in <module>
    rf.fit(X_train, y_train)
  File "C:\Users\sadia\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\ensemble\forest.py", line 275, in fit
    y, expanded_class_weight = self._validate_y_class_weight(y)
  File "C:\Users\sadia\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\ensemble\forest.py", line 478, in _validate_y_class_weight
    check_classification_targets(y)
  File "C:\Users\sadia\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\utils\multiclass.py", line 169, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y_type)
ValueError: Unknown label type: 'continuous'

The failure occurs at:

rf.fit(X_train, y_train)

Any help would be greatly appreciated.

Here is my sample data:

1 answer:

Answer 0: (score: 0)

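This error occurs because you are passing floating-point values as the target to a classifier, which expects categorical (discrete) class labels in the target vector. That is why the message says the label type is 'continuous'. Try a regression algorithm instead: for a continuous target vector such as shear wave velocity, use RandomForestRegressor rather than RandomForestClassifier.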
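To illustrate the swap, here is a minimal sketch. I don't have your "0.5-1.csv", so the DataFrame below is synthetic stand-in data: the value ranges, the linear relationship, and the noise level are all made up for illustration, and only the column names come from your question. It first uses scikit-learn's type_of_target to show how the library classifies your target, then fits a RandomForestRegressor with the same n_estimators you used.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.utils.multiclass import type_of_target

# Synthetic stand-in for "0.5-1.csv" (assumption: the real file is not
# available here; ranges and the linear relationship are invented).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "wavelength": rng.uniform(1.0, 50.0, size=200),
    "phase velocity": rng.uniform(100.0, 500.0, size=200),
})
df["shear wave velocity"] = (
    1.1 * df["phase velocity"] + rng.normal(0.0, 5.0, size=200)
)

X = df[["wavelength", "phase velocity"]]
y = df["shear wave velocity"]

# This is the same kind of check the classifier runs internally; a float
# target with non-integer values is reported as "continuous".
target_type = type_of_target(y)
print(target_type)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Same setup as in the question, with the regressor swapped in.
rf = RandomForestRegressor(n_estimators=40, random_state=0)
rf.fit(X_train, y_train)

# For regressors, score() returns the R^2 coefficient of determination,
# not classification accuracy.
r2 = rf.score(X_test, y_test)
print(r2)
```

With your real data you would keep your pd.read_csv("0.5-1.csv") line and change only the estimator. Note that the number printed by score() is then an R^2 value, so it should not be read as an accuracy percentage.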

Hope this helps!