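I am trying to run a logistic regression model. My variables are:
X_train - a pandas DataFrame of shape (13, 16110)
y_train - a list of length 16110
When I run the logistic regression model below, I get the following error: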
ValueError: Found input variables with inconsistent numbers of samples: [13, 16110]
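I don't understand why this error occurs, since my data appear to be the same length (16110). Should I instead be using a pandas Series of shape (16110,)?
EDIT: I realize this may be a duplicate question, but the others did not seem to address the exact problem of data that appear to be the same length.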
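To make the shapes concrete, here is a small sketch with synthetic stand-ins. The names `X_demo`, `y_demo` and `y_series` and the random values are placeholders made up for illustration, not my real data. It only inspects shapes: a list of 16110 labels and a Series built from it report the same length, so the Series question may not be where the mismatch comes from, while the first axis of the frame is 13.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins with the same shapes as described above (values are random).
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.normal(size=(13, 16110)))
y_demo = [int(v) for v in rng.integers(0, 2, size=16110)]

# A Series built from the list has shape (16110,), the same length as the list.
y_series = pd.Series(y_demo)

# The frame has 13 rows and 16110 columns; only the column count matches len(y_demo).
n_rows, n_cols = X_demo.shape
print("rows:", n_rows, "columns:", n_cols)
print("len(list):", len(y_demo), "Series shape:", y_series.shape)
```

Since scikit-learn estimators expect X laid out as (n_samples, n_features), the number compared against the label count is the row count, not the column count.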
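from sklearn import linear_model
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

def logisticRegression(X_train, y_train):
    # Scale, reduce with PCA, then classify; the grid search tunes n_components and C.
    logistic = linear_model.LogisticRegression()
    pipeline = Pipeline([('scl', StandardScaler()), ('pca', PCA(n_components=2)), ('clf', logistic)])
    param_grid = [dict(pca__n_components=[None, 2, 3, 4, 5], clf__C=[0.001, 0.01, 0.1, 1, 10, 100, 1000])]
    grid_search = GridSearchCV(pipeline, param_grid=param_grid)
    grid_search.fit(X_train, y_train)
    return grid_search.best_estimator_, grid_search.best_score_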
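The error can apparently be reproduced without running the grid search at all. The sketch below assumes that `sklearn.utils.check_consistent_length` is the check behind this message (it compares the first dimension of each input), and it uses made-up placeholder data (`X_demo`, `y_demo`) and a hypothetical helper `lengths_match` that is not part of my actual code.

```python
import numpy as np
import pandas as pd
from sklearn.utils import check_consistent_length

# Placeholder data with the same shapes as in the question.
X_demo = pd.DataFrame(np.zeros((13, 16110)))
y_demo = [0] * 16110

def lengths_match(X, y):
    """Return True if X and y have the same number of samples (first dimension)."""
    try:
        check_consistent_length(X, y)
        return True
    except ValueError as err:
        # Print the message raised by the consistency check.
        print(err)
        return False

# As given: 13 rows against 16110 labels.
as_is = lengths_match(X_demo, y_demo)

# Transposed: 16110 rows against 16110 labels.
transposed = lengths_match(X_demo.T, y_demo)
print(as_is, transposed)
```

If the 16110 columns are really the observations and the 13 rows the features, passing `X_train.T` to `fit` would line the samples up with `y_train`; whether that orientation is right depends on how the frame was built.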