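I often need to filter an array of strings for the elements that contain some substring (for example, a single character). Since this can be done either by matching a regex or with the .contains method, I decided to run a small test to confirm that .contains is faster (and therefore the better choice).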
my @array = "aa" .. "cc";
my constant $substr = 'a';
my $time1 = now;
my @a_array = @array.grep: *.contains($substr);
my $time2 = now;
@a_array = @array.grep: * ~~ /$substr/;
my $time3 = now;
my $time_contains = $time2 - $time1;
my $time_regex = $time3 - $time2;
say "contains: $time_contains sec";
say "regex: $time_regex sec";
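I then varied the size of @array and the length of $substr, and compared how long each method took to filter @array. In most cases (as expected), .contains was much faster than the regex, especially when @array was large. But with a small @array (as in the code above), the regex was slightly faster.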
contains: 0.0015010 sec
regex: 0.0008708 sec
Why does this happen?
Answer 0 (score: 4)
In an entirely unscientific experiment I just switched the regex version and the contains version around and found that the difference in performance you're measuring is not "regex vs contains" but in fact "first thing versus second thing":
When contains comes first:
contains: 0.001555 sec
regex: 0.0009051 sec
When regex comes first:
regex: 0.002055 sec
contains: 0.000326 sec
Benchmarking properly is a difficult task. It's really easy to accidentally measure something different from what you wanted to figure out.
When I want to compare the performance of multiple things, I usually run each one in a separate script, or have a shared script but run only one of the tasks per invocation (for example, using a multi sub MAIN("task1") approach). That way every task pays the same startup cost, and no task benefits from work done by an earlier one.
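A minimal sketch of the multi sub MAIN approach, reusing the array and substring from the question. The helper name timed and the task names "contains" and "regex" are illustrative choices, not something the original code defines:

```raku
# Shared setup, identical for every task.
my @array = "aa" .. "cc";
my constant $substr = 'a';

# Hypothetical helper: run a block once and return the elapsed seconds.
sub timed(&code) {
    my $start = now;
    code();
    now - $start;
}

# Only one of these candidates runs per invocation,
# selected by the first command-line argument.
multi sub MAIN("contains") {
    my $elapsed = timed { my @result = @array.grep: *.contains($substr) };
    say "contains: $elapsed sec";
}

multi sub MAIN("regex") {
    my $elapsed = timed { my @result = @array.grep: * ~~ /$substr/ };
    say "regex: $elapsed sec";
}
```

Assuming the script is saved as bench.raku (a made-up file name), each variant is measured in its own process with `raku bench.raku contains` and `raku bench.raku regex`, so neither measurement is the "first thing" or the "second thing" relative to the other. A single run is still noisy, so it is worth repeating each invocation several times before comparing.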
In the #perl6 IRC channel on freenode we have a bot called benchable6 which can do benchmarks for you. Read the section "Comparing Code" on its wiki page to learn how to have it compare two pieces of code for you.