Question

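I have written a script that builds a random forest regression model.

The problem is that my precision and F1 score both come out as 1.00, and nothing I change alters them. Changing the model type, the test size, or the rows and columns included in the dataset leaves the result exactly the same.

I suspect I am doing something wrong, and I would like to know under what circumstances this can happen.

Current results: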

Report:
               precision    recall  f1-score   support

           1       1.00      1.00      1.00         1

   micro avg       1.00      1.00      1.00         1
   macro avg       1.00      1.00      1.00         1
weighted avg       1.00      1.00      1.00         1

Accuracy:   1.0
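
Note the support column in the report: the score is computed on a single test sample. As a minimal sketch (not the scikit-learn implementation), the snippet below computes precision, recall and F1 from their textbook definitions for a hypothetical one-row test set in which the prediction happens to match the label. The helper name `binary_metrics` and the sample values are assumptions made for illustration only.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Precision, recall and F1 for one class, computed from their definitions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical one-row test set: true label 1, prediction 1.0.
y_true = [1]
y_pred = [1.0]

precision, recall, f1 = binary_metrics(y_true, y_pred)
support = len(y_true)

print("precision:", precision)
print("recall:   ", recall)
print("f1:       ", f1)
print("support:  ", support)
```

With only one sample, every metric is either 0 or 1, so a score of 1.00 here says very little about the model itself.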

The script is as follows:

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score
from sklearn import preprocessing

dataset = pd.read_csv("./Data/Assignment2DataSets/216037514.csv")  

print(len(dataset))

dataset["RainTomorrow"] = dataset["RainTomorrow"].astype('category')
dataset["RainTomorrow"] = dataset["RainTomorrow"].cat.codes

dataset.dropna(inplace=True)

dataset = pd.get_dummies(dataset, columns=["Date", "Location", "RainToday", "WindGustDir", "WindDir9am", "WindDir3pm"], prefix=["Date", "Loc", "RTod", "WGD", "WD9am", "WD3pd"])

X = dataset.drop('RainTomorrow', axis=1)

y = dataset['RainTomorrow'] # Line must stay the same

# Assignment template: train_test_split(X, y, test_size=0.20, random_state=STUDENTNUMBER)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=216037514)

classifier = RandomForestRegressor(n_estimators = 100, random_state = 216037514)

classifier.fit(X_train,y_train) # Line must stay the same

y_pred = classifier.predict(X_test) # Line must stay the same

print("Report:\n", classification_report(y_test,y_pred))
print("Accuracy:  ", accuracy_score(y_test,y_pred))
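
One thing that may be worth checking, offered as an assumption and not a confirmed diagnosis, is how many rows survive `dropna` and which labels they carry. The sketch below uses a small made-up DataFrame in place of the real CSV (the `Rainfall` and `Evaporation` columns and all values are hypothetical) and follows the same order as the script: encode `RainTomorrow` with `cat.codes`, then drop rows with missing values. It also counts labels encoded as -1, which is the code pandas assigns to missing values in a categorical column.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real CSV: only the first row is complete.
df = pd.DataFrame({
    "RainTomorrow": ["Yes", "No", np.nan, "Yes", "No"],
    "Rainfall": [0.2, np.nan, 1.0, np.nan, np.nan],
    "Evaporation": [4.0, 2.0, np.nan, 3.0, np.nan],
})

rows_before = len(df)

# Same order as the script: encode the label first, then drop incomplete rows.
df["RainTomorrow"] = df["RainTomorrow"].astype("category").cat.codes

# A missing label becomes the integer -1, so dropna no longer sees it as missing.
missing_label_codes = int((df["RainTomorrow"] == -1).sum())

df = df.dropna()
rows_after = len(df)

print("rows before dropna:", rows_before)
print("rows after dropna: ", rows_after)
print("labels coded as -1:", missing_label_codes)
print(df["RainTomorrow"].value_counts())
```

If the real data shrinks to a handful of rows in this way, a 20% test split can end up holding a single row, and a model trained on rows of only one class can only predict that class.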

No matter what I change, the accuracy and F1 score both stay at 100%.

0 Answers: