I have a Python DataFrame (called df) that looks like this when printed to the console:
date                      2019-09-03 00:00:00  ...  OverallAtt
students                                       ...
5c48943cbe8e95292564e163                  0.0  ...   78.321678
5c48943dbe8e95292564e165                100.0  ...   87.500000
5c48943dbe8e95292564e166                100.0  ...   86.713287
5c48943dbe8e95292564e167                100.0  ...   95.804196
5c48943dbe8e95292564e169                100.0  ...  100.000000
5c48943dbe8e95292564e16b                100.0  ...   98.601399
5c48943dbe8e95292564e16d                100.0  ...   85.314685
5c48943dbe8e95292564e173                100.0  ...   96.503497
5c48943dbe8e95292564e175                100.0  ...   83.216783
I am running it through a machine learning model, but I get an error when I try to print the predicted values. This is my code:
import pickle

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Features: every column except the target; missing values become 0
dataset = df
X = dataset.drop(['OverallAtt'], axis=1).fillna(0)
y = dataset['OverallAtt']  # Total attendance this year

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)

# Save the fitted model, then load it back
filename = 'Regressor_model.sav'
with open(filename, 'wb') as f:
    pickle.dump(regressor, f)
with open(filename, 'rb') as f:
    load_lr_model = pickle.load(f)

# PREDICT FROM NEW DATA
dataset = df
X = dataset.drop(['OverallAtt'], axis=1).fillna(0)
ActualAttendance = dataset['OverallAtt']  # Series indexed by student ID
Names = df.reset_index(drop=False)['students']  # Series indexed 0, 1, 2, ...
NewX_test = X
y_load_predit = load_lr_model.predict(NewX_test)
Newdf = pd.DataFrame({'Full Name': Names, 'Actual Attendance': ActualAttendance, 'Predicted Attendance': y_load_predit})
print(Newdf)
ActualAttendance and Names both have length 382, and y_load_predit is also an array of length 382, so I am not sure why I am getting this error.
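One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: Names and ActualAttendance are both pandas Series, but they carry different indexes (0, 1, 2, ... versus the student IDs). When a DataFrame is built from a dict of Series, pandas aligns them on the union of their indexes, so the combined index can be longer than any single input. The sketch below reproduces that with a tiny made-up frame; the IDs, values, and the names `preds`, `names`, `actual`, `error_message`, and `result` are hypothetical stand-ins, not taken from the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df: three students indexed by ID.
df = pd.DataFrame(
    {"OverallAtt": [78.3, 87.5, 86.7]},
    index=pd.Index(["a1", "a2", "a3"], name="students"),
)
preds = np.array([80.0, 85.0, 90.0])  # stand-in for y_load_predit

names = df.reset_index(drop=False)["students"]  # index is 0, 1, 2
actual = df["OverallAtt"]                        # index is the student IDs

# The two Series share no labels, so their union has twice as many entries.
print(len(names.index.union(actual.index, sort=False)))

# Mixing the misaligned Series with a plain array triggers a length check.
error_message = None
try:
    pd.DataFrame({
        "Full Name": names,
        "Actual Attendance": actual,
        "Predicted Attendance": preds,
    })
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Passing plain arrays everywhere avoids index alignment altogether.
result = pd.DataFrame({
    "Full Name": df.index.to_numpy(),
    "Actual Attendance": actual.to_numpy(),
    "Predicted Attendance": preds,
})
print(result)
```

If this is the cause, an alternative to converting everything to arrays is to keep the student-ID index throughout and pass `index=df.index` for the predictions, which keeps each row labelled by student.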