Question

Good evening,

I want to repeat subsetting and linear regression over the same DataFrame, once for each article code.

#I get the unique codes of the articles
codes = np.unique(data["cod_id"])

#Split
X = data['price']
y = data["quantity"]

accuracy = []
for i in np.nditer(codes):
    data = data.loc[df["cod_id"] == i]

#Arrange an if statement to avoid 0-element arrays, while splitting (80% train, 20% test)

    if int(len(data)) <= 2:

        X_train = X 
        y_train = y  

        # Test dataset 
        X_test = X 
        y_test = y 
    else:
        t = 0.8
        t = int(t*len(data)) 

        #Split     
        t = int(t*len(data)) 
        # Train dataset 
        X_train = X[:t] 
        y_train = y[:t]  

        # Test dataset 
        X_test = X[t:] 
        y_test = y[t:]

    #Run the Algorithm
    lr = linear_model.LinearRegression()
    lr.fit(X_train, y_train)

    predicted_test_tr = lr.predict(X_test)

    pred_cost = (X_test["price"] * predicted_test_tr).sum()
    real_cost = (X_test["price"] * y_test).sum()

    delta = (pred_cost - owner_cost)/owner_cost 
    accuracy.append(delta)

But it gives me a list "accuracy" that is as long as "codes", with the same value in every position:

print(accuracy)

5.43234
5.43234
5.43234
...

How can I fix this? Thanks.
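
A likely cause, judging only from the snippet above: X and y are taken from the whole frame once, before the loop, and are never re-sliced per code, so every iteration fits and scores on the same rows. The loop also overwrites data and filters with df, which is not defined in the snippet. Below is a minimal sketch of one way to restructure it, not a definitive fix. The function name per_code_delta, the train_frac parameter, and the sample frame are made up for illustration, and real_cost is assumed to be what owner_cost was meant to hold.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model


def per_code_delta(data, train_frac=0.8):
    """Fit one regression per cod_id and return {code: relative cost error}."""
    deltas = {}
    # groupby yields a separate sub-frame per code, so `data` is never overwritten
    for code, group in data.groupby("cod_id"):
        X = group[["price"]]   # 2-D input, as scikit-learn expects
        y = group["quantity"]

        if len(group) <= 2:
            # too few rows to split: train and test on the same rows
            X_train, X_test, y_train, y_test = X, X, y, y
        else:
            t = int(train_frac * len(group))
            X_train, X_test = X.iloc[:t], X.iloc[t:]
            y_train, y_test = y.iloc[:t], y.iloc[t:]

        lr = linear_model.LinearRegression()
        lr.fit(X_train, y_train)
        predicted = lr.predict(X_test)

        pred_cost = (X_test["price"] * predicted).sum()
        real_cost = (X_test["price"] * y_test).sum()
        # assumption: real_cost replaces owner_cost and is non-zero
        deltas[code] = (pred_cost - real_cost) / real_cost
    return deltas


# made-up sample data, only for illustration
sample = pd.DataFrame({
    "cod_id":   [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "price":    [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "quantity": [10.0, 8.0, 6.5, 4.0, 2.5, 3.0, 6.0, 9.5, 12.0, 14.0],
})
result = per_code_delta(sample)
print(result)
```

Returning a dict keyed by code, instead of appending to a list, keeps each value tied to the article it belongs to. The split here is positional (first 80% of each group's rows), as in the original loop; a shuffled split would need a different approach.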

for loop and linear regression

0 answers: