How to find the top features from Naive Bayes using an sklearn pipeline

Time: 2018-11-11 20:17:52

Tags: scikit-learn pipeline feature-extraction naivebayes

Hi all,

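I am trying to apply Naive Bayes (MultinomialNB) through a pipeline and came up with the code below. I would like to find the ten most positive and the ten most negative words, but have not succeeded. Searching turned up code for finding the top features, but only for a model used without a pipeline. When I apply that code to the pipeline's output, it fails with the errors shown below. Could you help me find the feature importances from the pipeline output?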

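    from sklearn import preprocessing
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Pipeline dictionary
    pipelines = {
        'bow_MultinomialNB': make_pipeline(
            CountVectorizer(),
            preprocessing.Normalizer(),
            MultinomialNB()
        )
    }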


    # List tuneable hyperparameters of our  pipeline
    pipelines['bow_MultinomialNB'].get_params()


    # BOW -  MultinomialNB hyperparameters
    bow_MultinomialNB_hyperparameters = {
        'multinomialnb__alpha' : [1000,500,100,50,10,5,1,0.5,0.1,0.05,0.01,0.005,0.001,0.0005,0.0001]
    }

    # Create hyperparameters dictionary
    hyperparameters = {
        'bow_MultinomialNB' : bow_MultinomialNB_hyperparameters
    }


    tscv = TimeSeriesSplit(n_splits=3) #For time based splitting
    for name, pipeline in pipelines.items():
        print("NAME:",name)
        print("PIPELINE:",pipeline)


    # %time  (IPython magic from the original notebook; not valid in a plain Python script)
    # Create empty dictionary called fitted_models
    fitted_models = {}

    # Loop through model pipelines, tuning each one and saving it to fitted_models
    for name, pipeline in pipelines.items():
        # Create cross-validation object from pipeline and hyperparameters

        model = GridSearchCV(pipeline, hyperparameters[name], cv=tscv, n_jobs=1,verbose=1)


        # Fit model on X_train, y_train

        model.fit(X_train, y_train)


        # Store model in fitted_models[name] 

        fitted_models[name] = model


        # Print '{name} has been fitted'
        print(name, 'has been fitted.')

Important note:

        pipelines['bow_MultinomialNB'].steps[2][1].classes_

        ---------------------------------------------------------------------------
        AttributeError                            Traceback (most recent call last)
        <ipython-input-125-7d45b007e86b> in <module>()
        ----> 1 pipelines['bow_MultinomialNB'].steps[2][1].classes_

        AttributeError: 'MultinomialNB' object has no attribute 'classes_'
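The `AttributeError` above is what an unfitted estimator produces: `GridSearchCV` fits clones of the pipeline it is given, so the objects stored in `pipelines` are never fitted themselves. The following is a minimal sketch of that behaviour, not the asker's actual setup. It assumes a hypothetical toy corpus in place of `X_train` / `y_train`, a two-value `alpha` grid, and plain `cv=2` instead of `TimeSeriesSplit`.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

# Hypothetical toy data standing in for X_train / y_train.
docs = ["good great film", "great good plot", "good fun film",
        "bad boring film", "boring bad plot", "bad awful film"]
labels = [1, 1, 1, 0, 0, 0]

pipeline = make_pipeline(CountVectorizer(), Normalizer(), MultinomialNB())
search = GridSearchCV(pipeline, {"multinomialnb__alpha": [0.1, 1.0]}, cv=2)
search.fit(docs, labels)

# GridSearchCV works on clones, so the pipeline passed in stays unfitted.
original_is_fitted = hasattr(pipeline.named_steps["multinomialnb"], "classes_")

# With the default refit=True, the refitted clone is exposed as best_estimator_.
fitted_pipeline = search.best_estimator_
fitted_classes = fitted_pipeline.named_steps["multinomialnb"].classes_

print(original_is_fitted)
print(fitted_classes)
```

For the code in the question, the equivalent lookup would presumably be `fitted_models['bow_MultinomialNB'].best_estimator_`, whose steps can then be inspected in place of `pipelines['bow_MultinomialNB']`.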


        pipelines['bow_MultinomialNB'].steps[0][1].get_feature_names()
        ---------------------------------------------------------------------------
        NotFittedError                            Traceback (most recent call last)
        <ipython-input-126-2883929221d1> in <module>()
        ----> 1 pipelines['bow_MultinomialNB'].steps[0][1].get_feature_names()

        ~\Anaconda3\lib\site-packages\sklearn\feature_extraction\text.py in get_feature_names(self)
            958     def get_feature_names(self):
            959         """Array mapping from feature integer indices to feature name"""
        --> 960         self._check_vocabulary()
            961 
            962         return [t for t, i in sorted(six.iteritems(self.vocabulary_),

        ~\Anaconda3\lib\site-packages\sklearn\feature_extraction\text.py in _check_vocabulary(self)
            301         """Check if vocabulary is empty or missing (not fit-ed)"""
            302         msg = "%(name)s - Vocabulary wasn't fitted."
        --> 303         check_is_fitted(self, 'vocabulary_', msg=msg),
            304 
            305         if len(self.vocabulary_) == 0:

        ~\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_is_fitted(estimator, attributes, msg, all_or_any)
            766 
            767     if not all_or_any([hasattr(estimator, attr) for attr in attributes]):
        --> 768         raise NotFittedError(msg % {'name': type(estimator).__name__})
            769 
            770 

        NotFittedError: CountVectorizer - Vocabulary wasn't fitted.


        x=pipelines['bow_MultinomialNB'].steps[0][1]._validate_vocabulary()
        x.get_feature_names()

        ---------------------------------------------------------------------------
        AttributeError                            Traceback (most recent call last)
        <ipython-input-120-f620c754a34e> in <module>()
        ----> 1 x.get_feature_names()

        AttributeError: 'NoneType' object has no attribute 'get_feature_names'
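Judging by the last traceback, the private `_validate_vocabulary()` call returns `None`, so it cannot be used to reach the feature names. Once a fitted pipeline is available, one way to rank words is by the difference between the per-class log probabilities in `feature_log_prob_`. The sketch below assumes a binary problem and a hypothetical toy corpus, and fits the pipeline directly rather than through a grid search. "Most positive" here means the largest value of log P(word | class 1) minus log P(word | class 0), which is only one possible reading of feature importance for Naive Bayes.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

# Hypothetical toy data: label 1 = positive, label 0 = negative.
docs = ["good great film", "great good plot", "good fun film",
        "bad boring film", "boring bad plot", "bad awful film"]
labels = [1, 1, 1, 0, 0, 0]

fitted = make_pipeline(CountVectorizer(), Normalizer(), MultinomialNB())
fitted.fit(docs, labels)

vectorizer = fitted.named_steps["countvectorizer"]
classifier = fitted.named_steps["multinomialnb"]

# get_feature_names_out() exists in scikit-learn >= 1.0; older releases
# (such as the one in the traceback) expose get_feature_names() instead.
if hasattr(vectorizer, "get_feature_names_out"):
    feature_names = np.asarray(vectorizer.get_feature_names_out())
else:
    feature_names = np.asarray(vectorizer.get_feature_names())

# Rows of feature_log_prob_ follow the order of classifier.classes_ ([0, 1] here).
log_ratio = classifier.feature_log_prob_[1] - classifier.feature_log_prob_[0]
order = np.argsort(log_ratio)  # ascending: most negative words come first

n = 3  # the question asks for 10; the toy vocabulary is too small for that
most_negative = feature_names[order[:n]].tolist()
most_positive = feature_names[order[::-1][:n]].tolist()

print("most positive:", most_positive)
print("most negative:", most_negative)
```

With a grid search, the same lookups would start from `model.best_estimator_` instead of `fitted`. For more than two classes, each row of `feature_log_prob_` could be ranked on its own, although very frequent words then tend to dominate every class.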

Regards,
Shree

0 answers:

No answers