I built a preprocessing pipeline with sklearn's Pipeline, as follows:
model_pipeline = Pipeline(steps=[('pre processing categorical', pre_process_categorical),
                                 ('standardizing scale', StandardScaler()),
                                 ('K feature selector', SelectKBest()),
                                 ('forward feature selection', RFECV())])
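I want to see the names of the columns that are kept after the transformations. I looked at the selected indices at each stage and got the following results: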
model_pipeline.steps[-2][1].get_support(indices = True)
Out[50]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 17, 18, 20, 21, 22, 27, 28, 31, 32, 33, 34, 36, 37, 40, 43, 44, 46, 47, 49, 52, 53, 54, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 68, 69, 70, 71, 74, 76, 80, 81, 82, 83, 84, 85, 89, 90, 91, 92, 98, 102, 103, 109, 119, 121, 123, 125, 130, 136, 138, 146, 151, 152, 153, 157, 158, 162, 163, 164, 165, 167, 170, 171, 172, 173, 176, 183, 185, 186, 192, 194, 195, 199, 203, 206, 208, 215, 216, 220, 223, 232, 249, 252, 253, 254, 255, 257, 258, 259, 260, 261, 262, 263, 264, 265, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 282, 283, 285, 286, 288, 290, 292, 294, 295, 297, 298, 299, 300, 304, 306, 308, 312, 315])
model_pipeline.steps[-1][1].get_support(indices = True)
Out[51]: array([ 2, 5, 18, 22, 33, 36, 38, 43, 114, 122, 125, 127, 137, 142, 143, 144, 145, 146, 148, 149])
I can't understand how some indices (for example, 38) can be present in the last step when they are missing from the second-to-last one.
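One reading that would be consistent with these numbers: a selector's get_support(indices=True) reports positions relative to the input that step received, not relative to the original columns. Under that reading, 38 in the last step means "the column at position 38 of what SelectKBest passed on". Below is a minimal sketch of composing the two index lists in plain Python. The helper name map_to_original is hypothetical, and the lists are only the first 44 entries of Out[50] and the first 8 entries of Out[51], truncated for brevity.

```python
def map_to_original(previous_kept, current_kept):
    """Translate indices reported by a later selector into the column
    positions seen by the earlier selector (hypothetical helper)."""
    return [previous_kept[i] for i in current_kept]


# First 44 entries of Out[50] (the SelectKBest step), truncated for brevity.
kbest_kept = list(range(16)) + [
    17, 18, 20, 21, 22, 27, 28, 31, 32, 33, 34, 36, 37, 40,
    43, 44, 46, 47, 49, 52, 53, 54, 56, 57, 58, 59, 60, 61,
]

# First 8 entries of Out[51] (the RFECV step); these are positions
# within kbest_kept under the assumption stated above.
rfecv_kept = [2, 5, 18, 22, 33, 36, 38, 43]

original_positions = map_to_original(kbest_kept, rfecv_kept)
print(original_positions)  # [2, 5, 20, 28, 47, 53, 56, 61]
```

With NumPy arrays the same lookup would be previous_kept[current_kept]. The result is still only relative to the input of the SelectKBest step, so if the categorical preprocessing step adds or reorders columns, one more mapping through that step would be needed to get back to the original column names.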
So, two questions: