What exactly does TruncatedSVD get_params([deep]) do?

Asked: 2018-01-08 09:51:07

Tags: scikit-learn svd sklearn-pandas

I don't understand the get_params([deep]) method available on TruncatedSVD in sklearn. Can someone explain it to me?

1 answer:

Answer 0 (score: 0)

See the source of get_params here: https://github.com/scikit-learn/scikit-learn/blob/a24c8b46/sklearn/base.py#L213

This is not specific to TruncatedSVD: essentially all scikit-learn estimators have this method, because they inherit it from the BaseEstimator class.
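To illustrate the inheritance, here is a minimal sketch. ScaleShift and its scale and shift parameters are hypothetical names made up for this example; the only assumption is that the class stores its constructor arguments as attributes of the same name, which is the convention BaseEstimator relies on.

```python
from sklearn.base import BaseEstimator


class ScaleShift(BaseEstimator):
    """Hypothetical estimator, used only to illustrate get_params."""

    def __init__(self, scale=1.0, shift=0.0):
        # Store each constructor argument under the same attribute name;
        # get_params reads the values back from these attributes.
        self.scale = scale
        self.shift = shift


est = ScaleShift(scale=2.5)
params = est.get_params()  # inherited from BaseEstimator, not defined above
print(params)  # {'scale': 2.5, 'shift': 0.0}
```

Note that the class defines no get_params of its own; the method comes entirely from BaseEstimator.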

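As the name suggests, it returns the parameter values set on the estimator, i.e. its constructor arguments. In your case, see the parameter list here: http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.TruncatedSVD.html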

n_components : int, default = 2
algorithm : string, default = "randomized"
n_iter : int, optional (default 5)
random_state : int, RandomState instance or None, optional, default = None
tol : float, optional

Suppose you initialize TruncatedSVD with the following code:

from sklearn.decomposition import TruncatedSVD
svd = TruncatedSVD(n_components=5, n_iter=7, random_state=42)

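Calling svd.get_params() will then output the following (newer scikit-learn releases may list additional parameters):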

{'algorithm': 'randomized',
 'n_components': 5,
 'n_iter': 7,
 'random_state': 42,
 'tol': 0.0}

This is useful for making a clone of an object, and it is used extensively by scikit-learn utilities such as cross_val_score, GridSearchCV, Pipeline, etc.
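As a rough sketch of why get_params makes cloning possible: for a simple estimator with no nested estimators, rebuilding the object from its own parameters gives an unfitted copy with the same settings. This is a simplification for illustration, not the actual implementation of sklearn.base.clone, which also handles nested estimators and runs extra checks.

```python
from sklearn.base import clone
from sklearn.decomposition import TruncatedSVD

svd = TruncatedSVD(n_components=5, n_iter=7, random_state=42)

# Simplified idea: call the class constructor with the current parameters.
manual_copy = type(svd)(**svd.get_params(deep=False))

# The library helper that should be used in practice.
official_copy = clone(svd)

# All three objects are distinct but carry identical parameters.
same_params = (
    manual_copy.get_params() == official_copy.get_params() == svd.get_params()
)
print(same_params)  # True
```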

If deep=True, it also returns the parameters of any nested estimators, keyed as <component>__<parameter>. For example, take the following code:

from sklearn import svm
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import f_regression
from sklearn.pipeline import Pipeline
anova_filter = SelectKBest(f_regression, k=5)
clf = svm.SVC(kernel='linear')
anova_svm = Pipeline([('anova', anova_filter), ('svc', clf)])

The output of anova_svm.get_params(deep=False) is:

{'memory': None,
 'steps': [('anova',
   SelectKBest(k=5, score_func=<function f_regression at 0x7fb34d50ede8>)),
  ('svc', SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
     decision_function_shape='ovr', degree=3, gamma='auto', kernel='linear',
     max_iter=-1, probability=False, random_state=None, shrinking=True,
     tol=0.001, verbose=False))]}

And here is the output of anova_svm.get_params(True):

{'anova': SelectKBest(k=5, score_func=<function f_regression at 0x7fb34d50ede8>),
 'anova__k': 5,
 'anova__score_func': <function sklearn.feature_selection.univariate_selection.f_regression>,
 'memory': None,
 'steps': [('anova',
   SelectKBest(k=5, score_func=<function f_regression at 0x7fb34d50ede8>)),
  ('svc', SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
     decision_function_shape='ovr', degree=3, gamma='auto', kernel='linear',
     max_iter=-1, probability=False, random_state=None, shrinking=True,
     tol=0.001, verbose=False))],
 'svc': SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
   decision_function_shape='ovr', degree=3, gamma='auto', kernel='linear',
   max_iter=-1, probability=False, random_state=None, shrinking=True,
   tol=0.001, verbose=False),
 'svc__C': 1.0,
 'svc__cache_size': 200,
 'svc__class_weight': None,
 'svc__coef0': 0.0,
 'svc__decision_function_shape': 'ovr',
 'svc__degree': 3,
 'svc__gamma': 'auto',
 'svc__kernel': 'linear',
 'svc__max_iter': -1,
 'svc__probability': False,
 'svc__random_state': None,
 'svc__shrinking': True,
 'svc__tol': 0.001,
 'svc__verbose': False}

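You can see that the output now also contains the parameter values of SVC and SelectKBest, which are the inner estimators of the Pipeline estimator.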
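The double-underscore keys returned with deep=True are the same names that set_params accepts, which is presumably why tools such as GridSearchCV can tune nested estimators through a pipeline. A minimal sketch reusing the pipeline above (the new values 3 and 10.0 are arbitrary, chosen only for illustration):

```python
from sklearn import svm
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.pipeline import Pipeline

anova_svm = Pipeline([
    ('anova', SelectKBest(f_regression, k=5)),
    ('svc', svm.SVC(kernel='linear')),
])

# Update nested parameters using the <step>__<parameter> keys.
anova_svm.set_params(anova__k=3, svc__C=10.0)

deep_params = anova_svm.get_params(deep=True)
print(deep_params['anova__k'], deep_params['svc__C'])  # 3 10.0

# The change is applied to the inner estimator object itself.
print(anova_svm.named_steps['svc'].C)  # 10.0
```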