Why is LabelBinarizer from sklearn so slow?

Time: 2017-12-14 19:22:51

Tags: python performance scikit-learn

I am trying to compare the performance of sklearn's LabelBinarizer with a simple dictionary lookup:

from sklearn.preprocessing import LabelBinarizer
import time

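sample_list = list('abcdefg')
lb = LabelBinarizer()
lb.fit(sample_list)  # was dep_tag_list, which is not defined in this snippet
lb_t = lb.transform(sample_list)
# Map each label to its one-hot row so it can be looked up directly
sample_dict = {key: value for (key, value) in zip(sample_list, lb_t)}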

This code runs in --- 2.9169740676879883 seconds ---:

start_time = time.time()
result = lb.transform(sample_list*1000000)
print("--- %s seconds ---" % (time.time() - start_time))

This code runs in --- 0.6299951076507568 seconds ---:

start_time = time.time()
result = [sample_dict[el] for el in sample_list*1000000]
print("--- %s seconds ---" % (time.time() - start_time))

Am I comparing apples to apples? Why is LabelBinarizer so slow?

1 Answer:

Answer 0: (score: 1)

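LabelBinarizer is a wrapper around label_binarize, and it is also used internally by several other scikit-learn utilities. It therefore has to make sure that the data passed to it is appropriate.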

To do that, it performs several checks on the data it receives. Take a look at the source code of the transform() function:

y_is_multilabel = type_of_target(y).startswith('multilabel')
if y_is_multilabel and not self.y_type_.startswith('multilabel'):
    raise ValueError("The object was not fitted with multilabel"
                     " input.")

return label_binarize(y, self.classes_,
                      pos_label=self.pos_label,
                      neg_label=self.neg_label,
                      sparse_output=self.sparse_output)

So you can see that it checks whether the passed y is of an appropriate type that scikit-learn algorithms can handle. After that, the data is passed to label_binarize, which performs further checks on it. In my opinion, this is why it is slow.
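To illustrate the point, here is a toy model written with the standard library only. It is not scikit-learn's implementation: `checked_transform`, `plain_lookup` and `one_hot_table` are hypothetical names made up for this sketch, and the two checks are only stand-ins for the kind of validation described above. It shows how validating the whole input before encoding adds work that a bare dictionary lookup never does.

```python
import timeit


def fit_classes(labels):
    """Return the sorted unique labels, standing in for a fitted encoder."""
    return sorted(set(labels))


def one_hot_table(classes):
    """Precompute one one-hot row (a tuple of 0/1) per class."""
    n = len(classes)
    return {c: tuple(int(i == j) for j in range(n)) for i, c in enumerate(classes)}


def checked_transform(y, table):
    """Hypothetical transform: validate the whole input, then encode it."""
    # Check 1 (assumed for illustration): labels must be flat, not nested.
    if any(isinstance(el, (list, tuple, set, dict)) for el in y):
        raise ValueError("expected a flat sequence of labels")
    # Check 2 (assumed for illustration): every label was seen at fit time.
    unseen = set(y).difference(table)
    if unseen:
        raise ValueError("unseen labels: %r" % sorted(unseen))
    return [table[el] for el in y]


def plain_lookup(y, table):
    """The dictionary approach from the question: no validation at all."""
    return [table[el] for el in y]


sample_list = list("abcdefg")
table = one_hot_table(fit_classes(sample_list))
data = sample_list * 10000

# Both paths produce the same encoding; only the amount of work differs.
same_result = checked_transform(data, table) == plain_lookup(data, table)

t_checked = timeit.timeit(lambda: checked_transform(data, table), number=3)
t_plain = timeit.timeit(lambda: plain_lookup(data, table), number=3)
print("same result: %s" % same_result)
print("checked: %.3f s, plain: %.3f s" % (t_checked, t_plain))
```

The checked version walks the input two extra times before it does any encoding, so it should come out slower, although the exact ratio depends on the machine. The comparison in the question has the same shape: the dictionary version skips every safety check that transform() performs.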