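I am taking the online course Feature Selection in Machine Learning. The instructor taught how to select features for binary classification, which I understood clearly. The pseudocode is as follows: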
- First, builds one decision tree per feature, to predict the target
- Second, makes predictions using the decision tree and the mentioned feature
- Third, ranks the features according to the machine learning metric (roc-auc)
- Fourth, selects all features with roc_auc>0.5
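As a minimal sketch of how I understand these four steps, assuming scikit-learn and a synthetic dataset from `make_classification` (the function name `select_by_single_feature_auc`, the `max_depth=3` setting, and the train/test split are my own illustrative choices, not something from the course):

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def select_by_single_feature_auc(X_train, y_train, X_test, y_test, threshold=0.5):
    """Score each feature by the ROC-AUC of a decision tree trained on it alone."""
    scores = {}
    for j in range(X_train.shape[1]):
        # Step 1: build one decision tree per feature (max_depth is an assumed setting).
        tree = DecisionTreeClassifier(max_depth=3, random_state=0)
        tree.fit(X_train[:, [j]], y_train)
        # Step 2: predict positive-class probabilities using only that feature.
        proba = tree.predict_proba(X_test[:, [j]])[:, 1]
        scores[j] = roc_auc_score(y_test, proba)
    # Step 3: rank features from highest to lowest ROC-AUC.
    ranking = sorted(scores, key=scores.get, reverse=True)
    # Step 4: keep the features whose ROC-AUC exceeds the threshold.
    selected = [j for j in ranking if scores[j] > threshold]
    return scores, ranking, selected


# Synthetic binary data, used only to make the sketch runnable.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=3, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

scores, ranking, selected = select_by_single_feature_auc(
    X_train, y_train, X_test, y_test
)
print("ROC-AUC per feature:", scores)
print("Selected features:", selected)
```

Scoring on a held-out split rather than on the training data is my assumption here; the course may evaluate differently, for example with cross-validation.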
However, he did not mention the roc_auc condition for multiclass classification, but I think it should be generalized as follows:
- Fourth, selects all features with roc_auc > (1 / number of classes to predict)
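For context, this is roughly how I would compute the per-feature score in the multiclass case, assuming scikit-learn's `roc_auc_score` with `multi_class="ovr"` and macro averaging (one-vs-rest is my own choice, the course does not specify it, and the three-class synthetic data is only for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class data, used only to make the sketch runnable.
X, y = make_classification(
    n_samples=600,
    n_features=6,
    n_informative=3,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

multiclass_scores = {}
for j in range(X.shape[1]):
    # One decision tree per feature, as in the binary procedure.
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(X_train[:, [j]], y_train)
    # Multiclass ROC-AUC needs the full probability matrix (one column per class).
    proba = tree.predict_proba(X_test[:, [j]])
    multiclass_scores[j] = roc_auc_score(
        y_test, proba, multi_class="ovr", average="macro"
    )

print("Multiclass ROC-AUC per feature:", multiclass_scores)
```

The resulting scores can then be compared against whichever threshold turns out to be appropriate.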
Can anyone provide more insight on this?