I am trying to implement a random forest classifier in Python, but it raises a ValueError. The sample code is:
from sklearn.ensemble import RandomForestClassifier
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
df = pd.read_csv("0.5-1.csv")
df.head()
X = df[['wavelength', 'phase velocity']]
y = df['shear wave velocity']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print (len(X_train),len(X_test),len(y_train),len(y_test))
rf = RandomForestClassifier(n_estimators=40)
rf.fit(X_train, y_train)
print (rf.score(X_test, y_test))
The error is:
Traceback (most recent call last):
  File "G:\My Drive\ANN\test\0.5-1\0.5-1_tunecode.py", line 23, in <module>
    rf.fit(X_train, y_train)
  File "C:\Users\sadia\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\ensemble\forest.py", line 275, in fit
    y, expanded_class_weight = self._validate_y_class_weight(y)
  File "C:\Users\sadia\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\ensemble\forest.py", line 478, in _validate_y_class_weight
    check_classification_targets(y)
  File "C:\Users\sadia\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\utils\multiclass.py", line 169, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y_type)
ValueError: Unknown label type: 'continuous'
The failure occurs at:
rf.fit(X_train, y_train)
Any help would be greatly appreciated.
Answer 0 (score: 0)
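This error occurs because you are passing floating-point values as the target vector to a classifier, which expects categorical class labels. 'shear wave velocity' is a continuous quantity, so this is a regression problem rather than a classification problem. In other words, for a continuous target vector you should use RandomForestRegressor instead of RandomForestClassifier.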
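A minimal sketch of that change, under some assumptions: the original "0.5-1.csv" is not available here, so the data below is synthetic, and the relationship between the columns is made up purely for illustration. Only the column names and n_estimators=40 come from the question. type_of_target is the scikit-learn helper behind the "Unknown label type" check, so it can be used to confirm what the classifier is complaining about.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.utils.multiclass import type_of_target

# Synthetic stand-in for "0.5-1.csv" (assumption: the real file is not available).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "wavelength": rng.uniform(1.0, 100.0, size=200),
    "phase velocity": rng.uniform(100.0, 500.0, size=200),
})
# Hypothetical relationship plus noise, invented only for this illustration.
df["shear wave velocity"] = (
    1.1 * df["phase velocity"]
    + 0.5 * df["wavelength"]
    + rng.normal(0.0, 5.0, size=200)
)

X = df[["wavelength", "phase velocity"]]
y = df["shear wave velocity"]

# Float targets like these are reported as 'continuous', the label type
# named in the ValueError raised by RandomForestClassifier.
target_type = type_of_target(y)
print(target_type)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Same setup as in the question, with the regressor swapped in.
rf = RandomForestRegressor(n_estimators=40, random_state=0)
rf.fit(X_train, y_train)

# For regressors, score() returns the R^2 coefficient, not accuracy.
r2 = rf.score(X_test, y_test)
print(r2)
```

Note that rf.score now reports R^2 rather than classification accuracy, so the printed number should be read as goodness of fit on the test set.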
Hope this helps!