Ordinal Ridge and Lasso regression in Python

Time: 2018-07-08 19:01:19

Tags: python python-3.x python-2.7 statistics regression

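Thank you for taking the time to read my question.
I have to run ordinal Ridge and Lasso regression on my dataset. The value I want to predict is ordinal (5 levels), and I have many continuous predictors (more than 60), not all of which make sense logically. So I want to run ordinal regression with Lasso and Ridge penalties to find the important predictors. I am new to Python and don't really know how to do this, so I would appreciate any help from the community. I found the mord module, but (assuming I am using it correctly) it does not provide an ordinal Lasso. Can anyone help? Thanks in advance.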

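Update: I wrote the following code. It runs without errors, but the accuracy is lower than in my previous analysis, so I think I am making a mistake somewhere. I suspect it may be in the scaling, but I don't know how to check. I would appreciate any help. "rel" takes five values (1, 2, 3, 4, 5) and is the value I want to predict.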

import numpy as np
import pandas as pd
import mord
from sklearn.preprocessing import scale, StandardScaler
from sklearn.metrics import mean_squared_error
import csv

#defining a function to rotate numbers in an array 
def leftRotatebyOne(arr, n):
    temp = arr[0]
    for i in range(n-1):
        arr[i] = arr[i+1]
    arr[n-1] = temp

#defining OR to do Ordinal Ridge Regression    
OR = mord.OrdinalRidge()

#defining the loop to go through all participants
for s in range(17):

    #reading the data for each participant
    df = pd.read_csv("Complete{0}.csv".format(s+1), index_col=0, header=None).dropna()
    df.index.name = 'subject{0}'.format(s+1)
    df.columns = ["ch{0}".format(i+1) for i in range(64)] +["irrel", "rel"]
    #defining output and predictors
    y = df.rel
    X = df.drop(['rel', 'irrel'], axis=1).astype('float64')

    #an array containing trial numbers
    T = np.array(range(480))

    #a matrix to hold the results of all 480 leave-one-out runs for each participant
    #(64 coefficients + predicted + error + accuracy = 67 rows)
    out=np.empty((67,480))

    #running the model for all trials (each time keeping one out)
    for t in range(480):

        T1 = T[:479]
        T2 = T[479:]   #the last one which is going to be out

        ## The last element of T is always the one left out; T is rotated on every pass, so the held-out trial changes

        #train samples
        X_train = X.iloc[T1,:]
        y_train = np.array(y.iloc[T1])

        scaler = StandardScaler().fit(X_train)

        #test sample
        X_test = X.iloc[T2,:]
        y_test = np.array(y.iloc[T2])

        #rotating T
        leftRotatebyOne(T,480)

        #running ordinal ridge regression from the module mord
        OR.fit(scaler.transform(X_train), y_train)
        predicted = OR.predict(scaler.transform(X_test))
        error = mean_squared_error(y_test, predicted)
        coeff = pd.Series(OR.coef_, index=X.columns)

        #getting the accuracy of each prediction
        if predicted == y_test:
            accuracy = 1
        else:
            accuracy = 0

        #having all results in a matrix (each column is for leaving out one of the trials)
        out[:,t]=np.hstack((coeff,predicted,error, accuracy))

    #saving the results for each participant 
    np.savetxt("reg{0}.csv".format(s+1), out, delimiter=',')

#saving all results in one file
#the per-participant files have no header row, so read and write without one
filenames = ["reg{0}.csv".format(i+1) for i in range(17)]
dataframes = [pd.read_csv(p, header=None) for p in filenames]
merged_dataframe = pd.concat(dataframes, axis=1)
merged_dataframe.to_csv("merged.csv", index=False, header=False)

#reading the file that contains all the models for all the participants
cl = pd.read_csv("merged.csv", header=None).dropna()

#naming the rows
cl.index = ["ch{0}".format(i+1) for i in range(64)] + ["predicted", "error", "accuracy"]

#calculating the mean of each row
print(cl.mean(axis=1))

#getting the mean accuracy for each participant
for s in range(17):
    regg = pd.read_csv("reg{0}.csv".format(s+1), header=None).dropna()
    regg.index = ["ch{0}".format(i+1) for i in range(64)] + ["predicted", "error", "accuracy"]

    print(regg.mean(axis=1)["accuracy"])

I couldn't find anything other than the mord module. I want to do leave-one-out cross-validation, holding out a single sample for testing each time.
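For reference, the leave-one-out scheme described here can be sketched with scikit-learn's own helpers instead of manual index rotation. This is only a minimal sketch: the data is synthetic (a hypothetical stand-in for one participant's file), and plain `Ridge` with rounding is used so that the example runs without mord. Since `mord.OrdinalRidge` follows the scikit-learn estimator interface, it should be possible to put it in the pipeline instead, though that is not tested here.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for one participant: 60 trials, 8 predictors,
# ordinal target with levels 1..5 (hypothetical data, not the real files).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
latent = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.3 * rng.normal(size=60)
y = np.digitize(latent, np.quantile(latent, [0.2, 0.4, 0.6, 0.8])) + 1

# The pipeline refits the scaler on every training fold, so the
# held-out trial never influences the scaling.
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))

# One prediction per trial, each from a model that never saw that trial.
raw = cross_val_predict(model, X, y, cv=LeaveOneOut())

# Round to the nearest level and clip to the valid range 1..5.
predicted = np.clip(np.rint(raw), 1, 5).astype(int)

accuracy = np.mean(predicted == y)
mean_abs_error = np.mean(np.abs(predicted - y))
print(f"LOO accuracy: {accuracy:.3f}, mean absolute error: {mean_abs_error:.3f}")
```

Exact-match accuracy treats a prediction of 4 for a true 5 the same as a prediction of 1, so for an ordinal target the mean absolute error is often the more informative of the two numbers.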

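P.S. I am following the instructions at this link:
http://nbviewer.jupyter.org/github/JWarmenhoven/ISL-python/blob/master/Notebooks/Chapter%206.ipynb
Following those steps exactly produces this error:
module 'glmnet' has no attribute 'ElasticNet'

*However, they do not cover ordinal regression.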
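Since neither mord nor the linked notebook offers an ordinal Lasso, one possible workaround is to split the ordinal target into a series of binary questions ("is y greater than level k?") and fit an L1-penalized logistic regression to each. This is a substitute technique, not something mord provides, and the helper names `fit_ordinal_l1` and `predict_ordinal`, the synthetic data, and the value `C=0.5` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

def fit_ordinal_l1(X, y, levels, C=0.5):
    """Fit one L1-penalized logistic model per threshold 'y > level'."""
    models = []
    for level in levels[:-1]:
        clf = LogisticRegression(penalty="l1", solver="liblinear", C=C)
        clf.fit(X, (y > level).astype(int))
        models.append(clf)
    return models

def predict_ordinal(models, X, levels):
    """Count how many thresholds each sample is predicted to exceed."""
    exceeded = sum(m.predict(X) for m in models)
    return np.asarray(levels)[exceeded]

# Hypothetical data: 10 predictors, only the first two carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
latent = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.3 * rng.normal(size=300)
y = np.digitize(latent, np.quantile(latent, [0.2, 0.4, 0.6, 0.8])) + 1

levels = [1, 2, 3, 4, 5]
X_scaled = StandardScaler().fit_transform(X)
models = fit_ordinal_l1(X_scaled, y, levels)
pred = predict_ordinal(models, X_scaled, levels)

# One row of coefficients per threshold; predictors whose column is
# zero in every row were dropped by the L1 penalty.
coefs = np.vstack([m.coef_ for m in models])
kept = np.flatnonzero(np.any(coefs != 0, axis=0))
print("training accuracy:", np.mean(pred == y))
print("predictors kept by at least one threshold model:", kept)
```

The accuracy printed here is measured on the training data, so it is optimistic. For an honest estimate, the fit and the scaling would both need to go inside the leave-one-out loop.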

1 Answer:

Answer 0: (score: 0)

You can use sklearn for this.

from sklearn import linear_model

regr_lasso = linear_model.Lasso(alpha=0.1)

regr_ridge = linear_model.Ridge(alpha=1.0)

regr_elasticnet = linear_model.ElasticNet(random_state=0)

For more details, see the following link: http://scikit-learn.org/stable/auto_examples/linear_model/plot_lasso_coordinate_descent_path.html
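As a rough illustration of how these estimators can point to the important predictors, the sketch below fits `LassoCV` (which picks `alpha` by cross-validation) and lists the predictors whose coefficients stay non-zero. The data is synthetic and hypothetical. Note that this treats the target as a continuous number, so it ignores the ordinal structure the question asks about.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 12 predictors, only columns 0 and 3 affect y.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 12))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.5 * rng.normal(size=200)

# Scale first so the L1 penalty treats all predictors alike,
# then let 5-fold cross-validation choose the penalty strength.
model = make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0))
model.fit(X, y)

lasso = model.named_steps["lassocv"]
selected = np.flatnonzero(lasso.coef_)
print("chosen alpha:", lasso.alpha_)
print("predictors with non-zero coefficients:", selected)
```

Ridge shrinks coefficients but does not set them exactly to zero, so only the Lasso and ElasticNet variants can be read as a predictor selection in this way.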