What does "UserWarning: No features were selected" mean?

Time: 2019-08-01 08:32:59

Tags: python pandas feature-selection user-warning

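I am using a LassoCV() model for feature selection. It raises the warning below and selects no features: "C:\Users\xyz\Anaconda3\lib\site-packages\sklearn\feature_selection\base.py:80: UserWarning: No features were selected: either the data is too noisy or the selection test too strict. UserWarning)"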

The code is given below.

The data is at https://www.kaggle.com/jtrofe/beer-recipes/downloads/recipeData.csv/3

import pandas as pd
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LassoCV

# dataset URL = https://www.kaggle.com/jtrofe/beer-recipes/downloads/recipeData.csv/3
dataframe = pd.read_csv('Brewer Friend Beer Recipes.csv', encoding = 'latin')
# Encoding the non numerical columns
def encoding_data(dataframe):
    if(dataframe.dtype == 'object'):
        return LabelEncoder().fit_transform(dataframe.astype(str))
    else:
        return dataframe
# Feature Selection using the selected Target Feature
def feature_selection(raw_dataframe, target_feature_list):
    output_list = []
    # preprocessing Converting Categorical data into Numeric Data
    dataframe = raw_dataframe.apply(encoding_data)
    column_list = dataframe.columns.tolist()
    dataframe = dataframe.dropna()
    for target in target_feature_list:
        target_feature = target
        x = dataframe.drop(columns=[target_feature])
        y = dataframe[target_feature].values
        # Lasso feature selection 
        estimator = LassoCV(cv = 3, n_alphas = 1)
        featureselection = SelectFromModel(estimator)
        featureselection.fit(x,y)
        features = featureselection.transform(x)
        feature_list = x.columns[featureselection.get_support()]
        features = ''
        features = ', '.join(feature_list)
        l = (target,features)
        output_list.append(l)
    output_df = pd.DataFrame(output_list,columns = ['Name','Selected Features'])
    print('\nThe Feature Selection is done with the respective target feature(s)')
    return output_df
print(feature_selection(dataframe, ['BrewMethod']))

I get this warning, and no features are selected.

"C:\Users\xyz\Anaconda3\lib\site-packages\sklearn\feature_selection\base.py:80: UserWarning: No features were selected: either the data is too noisy or the selection test too strict. UserWarning)"

Any ideas on how to fix this?

1 Answer:

Answer 0: (score: 0)

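If no features were selected, you can gradually decrease lambda (called alpha in scikit-learn). A smaller value weakens the penalty and may yield some nonzero coefficients.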
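As a rough sketch of what decreasing alpha looks like in practice, the snippet below fits a plain Lasso at several penalty strengths and counts how many features SelectFromModel keeps at each one. The data is synthetic (make_regression), not the beer-recipe file, and the helper name count_selected and the chosen fractions of alpha_max are only illustrative assumptions. Note also that the question's LassoCV was given n_alphas = 1, which leaves cross-validation a single candidate alpha, so that setting may be worth revisiting.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

# Synthetic stand-in data (an assumption; not the dataset from the question).
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

def count_selected(alpha):
    """Hypothetical helper: number of features kept at a given alpha."""
    selector = SelectFromModel(Lasso(alpha=alpha, max_iter=10000))
    selector.fit(X, y)
    return int(selector.get_support().sum())

# For scikit-learn's Lasso objective, every coefficient is zero once alpha
# reaches max|Xc^T yc| / n_samples (Xc, yc are the centered X and y).
Xc = X - X.mean(axis=0)
yc = y - y.mean()
alpha_max = np.abs(Xc.T @ yc).max() / len(y)

# Walk the penalty down from "too strict" to progressively weaker values.
alphas = [alpha_max * factor for factor in (2.0, 0.5, 0.1, 0.01)]
counts = [count_selected(alpha) for alpha in alphas]

for alpha, count in zip(alphas, counts):
    print(f"alpha={alpha:.4f} -> {count} feature(s) selected")
```

At the largest alpha nothing is selected, which is the situation the warning describes; features start to appear as alpha drops below alpha_max. With LassoCV, the same idea would be to pass a wider grid of candidate alphas rather than a single value.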

It is quite unusual for no coefficients to be selected. Consider checking the correlations in your data; you may have a lot of collinearity.
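One simple way to look for collinearity is to list the column pairs whose absolute Pearson correlation is high. This is only a sketch: the toy DataFrame, its column names, and the 0.9 cutoff are assumptions for illustration, and on the real data you would run it on the encoded feature columns.

```python
import numpy as np
import pandas as pd

# Toy data (an assumption): "a_scaled" is almost a multiple of "a".
rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)
frame = pd.DataFrame({
    "a": a,
    "a_scaled": 2.0 * a + rng.normal(scale=0.01, size=n),
    "b": rng.normal(size=n),
})

# Absolute pairwise Pearson correlations between columns.
corr = frame.corr().abs()

# Collect each pair once (upper triangle only) if it exceeds the cutoff.
threshold = 0.9  # illustrative cutoff, not a rule
cols = corr.columns
high_pairs = [
    (cols[i], cols[j], float(corr.iloc[i, j]))
    for i in range(len(cols))
    for j in range(i + 1, len(cols))
    if corr.iloc[i, j] > threshold
]

for left, right, value in high_pairs:
    print(f"{left} vs {right}: |r| = {value:.3f}")
```

Pairs that show up here carry nearly the same information, so dropping one column from each pair before fitting the Lasso is a reasonable thing to try.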