晚上好,
我想在同一数据框架上重申子集化和线性回归。
#I get the unique codes of the articles
codes = np.unique(data["cod_id"])
#Split
X = data['price']
y = data["quantity"]
accuracy = []
for i in np.nditer(codes):
data = data.loc[df["cod_id"] == i]
#Arrange an if statement to avoid 0-element arrays, while splitting (80% train, 20% test)
if int(len(data)) <= 2:
X_train = X
y_train = y
# Test dataset
X_test = X
y_test = y
else:
t = 0.8
t = int(t*len(data))
#Split
t = int(t*len(data))
# Train dataset
X_train = X[:t]
y_train = y[:t]
# Test dataset
X_test = X[t:]
y_test = y[t:]
#Run the Algorithm
lr = linear_model.LinearRegression()
lr.fit(X_train, y_train)
predicted_test_tr = lr.predict(X_test)
pred_cost = (X_test["price"] * predicted_test_tr).sum()
real_cost = (X_test["price"] * y_test).sum()
delta = (pred_cost - owner_cost)/owner_cost
accuracy.append(delta)
但它会报告一个列表&#34;准确度&#34;,只要&#34;代码&#34;一个,但在每个位置具有相同的值
print(accuracy)
5.43234
5.43234
5.43234
...
如何解决此问题? 谢谢