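High-performance way to compile regular expressions in AS3?

Time: 2010-10-12 05:43:51

Tags: regex actionscript-3

I have a very simple question. What is the best way (highest performance, lowest memory usage, etc.) to compile a regular expression in AS3?

For example, is this: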

private var expression:RegExp = new RegExp(".*a$");

private function modify():void {
    /* uses "expression" to chop up string */
}

faster than this:

private var expression:RegExp = /.*a$/;

private function modify():void {
    /* uses "expression" to chop up string */
}

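Also, if I only plan to use the expression once, does it really need to be an instance variable? For example, which of the following methods would, in theory, execute faster: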

private var myRegEx:RegExp = /\n/;    

private function modify1():void {
    myString.split(/\n/);
}

private function modify2():void {
    myString.split(myRegEx);
}

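Will modify1() run at the same speed as modify2()? That is, does AS3 compile a new RegExp instance on every call to modify1(), since the literal is not bound to an instance variable?

Any help would be much appreciated :)

2 answers:

Answer 0: (score: 5)

Your test is not very good. Here is why: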

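  1. getTimer measures wall-clock time, but it is CPU time that really matters. If at some point the scheduler decides, for whatever reason, not to run the Flash Player, you get less CPU time within the same time frame. That is why the results vary. getTimer is the best tool available, but it is not much help if you are trying to track deviations of a few percent.
  2. The deviation is really small, about 8%. Part of it stems from the effect described in point 1; when I ran the tests, the results did vary between runs. 8% can come from anywhere. It might depend simply on your machine, your OS, the minor player version, or anything else. The result is not significant enough to rely on. Besides, an 8% speedup is not worth considering unless you find a horrible bottleneck in your string processing that can be fixed by this or that trick with RegExps.
  3. Most importantly: the difference you measured has nothing to do with regular expressions, only with everything else in your test.
  4. Let me explain that in detail. Try adding public function test7():void{}. On my machine, it takes about 30%-40% as long as the other tests. Here are some numbers: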

        Running Tests
        -------------------------------
        Testing method: test1, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.01716ms
            Longest Iteration Time: 5ms
            Shortest Iteration Time: 0ms
            Total Test Time: 901ms
        -------------------------------
        Testing method: test2, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.01706ms
            Longest Iteration Time: 5ms
            Shortest Iteration Time: 0ms
            Total Test Time: 892ms
        -------------------------------
        Testing method: test3, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.01868ms
            Longest Iteration Time: 5ms
            Shortest Iteration Time: 0ms
            Total Test Time: 969ms
        -------------------------------
        Testing method: test4, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.01846ms
            Longest Iteration Time: 5ms
            Shortest Iteration Time: 0ms
            Total Test Time: 966ms
        -------------------------------
        Testing method: test5, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.01696ms
            Longest Iteration Time: 5ms
            Shortest Iteration Time: 0ms
            Total Test Time: 890ms
        -------------------------------
        Testing method: test6, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.01696ms
            Longest Iteration Time: 5ms
            Shortest Iteration Time: 0ms
            Total Test Time: 893ms
        -------------------------------
        Testing method: test7, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.00572ms
            Longest Iteration Time: 1ms
            Shortest Iteration Time: 0ms
            Total Test Time: 306ms
        -------------------------------
    

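    But why? The following things are expensive:

    • getTimer() - calling global functions (as well as static methods of other classes) is slow.
    • (tester[methodName] as Function).apply(); - this is expensive. It is a dynamic property access that requires creating a closure, then a cast to Function, then an invocation through apply. I cannot think of a slower way to call a function.
    • var tester:RegExpTester = new RegExpTester(); - instantiation is expensive, because it requires allocation and initialization.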
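
    To make the second point concrete, here is a minimal sketch of three ways to invoke the same method, ordered from the slowest pattern described above to a plain direct call. The CallStyles class and its work() and demo() methods are hypothetical names used only for illustration; they are not part of the test code under discussion, and the relative cost of each style is expected to vary by player version.

    ```actionscript
    package {
        public class CallStyles {

            // Empty method, so that only the cost of the call itself is exercised.
            public function work():void {
            }

            public function demo(iterations:int):void {
                var i:int;

                // 1. Dynamic lookup, cast and apply() on every iteration
                //    (the pattern used in the original test harness).
                for (i = 0; i < iterations; i++) {
                    (this["work"] as Function).apply();
                }

                // 2. Look the method closure up once, then call through the reference.
                var f:Function = this["work"];
                for (i = 0; i < iterations; i++) {
                    f();
                }

                // 3. Direct, statically bound call.
                for (i = 0; i < iterations; i++) {
                    work();
                }
            }
        }
    }
    ```

    Timing each loop separately (for example with getTimer() before and after it) should show how much of a benchmark's total is call overhead rather than the work being measured.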

    The following code performs much better. On my machine, the overhead measured by test7 drops by a factor of about 20:

        private function test(methodName:String, iterations:int = 100):void {
            output("Testing method: " + methodName + ", " + iterations + " iterations...");    
            var start:Number = getTimer();
            var tester:RegExpTester = new RegExpTester();
            var f:Function = tester[methodName];
            for (var i:uint = 0; i < iterations; i++) f();//this call to f still is slower than the direct method call would be
            var wholeTime:Number = getTimer() - start;
            output("Test Complete.");
            output("\tAverage Iteration Time: " + (wholeTime / iterations) + "ms");
            output("\tTotal Test Time: " + wholeTime + "ms");
            output("-------------------------------");
        }
    

    Again, some numbers:

        Running Tests
        -------------------------------
        Testing method: test1, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.01094ms
            Total Test Time: 547ms
        -------------------------------
        Testing method: test2, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.01094ms
            Total Test Time: 547ms
        -------------------------------
        Testing method: test3, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.01296ms
            Total Test Time: 648ms
        -------------------------------
        Testing method: test4, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.01288ms
            Total Test Time: 644ms
        -------------------------------
        Testing method: test5, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.01086ms
            Total Test Time: 543ms
        -------------------------------
        Testing method: test6, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.01086ms
            Total Test Time: 543ms
        -------------------------------
        Testing method: test7, 50000 iterations...
        Test Complete.
            Average Iteration Time: 0.00028ms
            Total Test Time: 14ms
        -------------------------------
    

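    So now the overhead is down to roughly 2.5% of each test (14 ms out of about 545 ms), which makes it negligible (although it could in fact be reduced further). The deviation, however, is now 16%, twice as much as before. It is starting to look clearer. IMHO it is still not significant, but it does point to the two slowest methods: test3 and test4.

    Why would that be? Simple: both methods create a new RegExp object on every call (one using a literal, the other using the constructor). That accounts for the time difference we can measure. The relative difference is bigger now because, in the original harness, every iteration also instantiated a RegExpTester, whose two instance-variable initializers each create a RegExp. So test3 and test4 created three regular expressions per iteration against two for the other tests. What remains now is purely the cost of creating 50000 RegExp instances. Everything else is equally fast.

    If there is a conclusion to draw in answer to your question: there is no measurable difference between a RegExp literal and a constructed RegExp. So I am afraid the answer is: "It does not really matter, as long as you keep the general rules of performance optimization in mind." Hope that helps.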
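
    As a rough illustration of the one difference the numbers do show (creating a RegExp per call versus reusing one), here is a minimal sketch. The LineSplitter class and its method names are hypothetical and not taken from the test code; the sketch assumes, as the test3/test4 measurements suggest, that a literal inside a method body yields a new RegExp object each time the method runs.

    ```actionscript
    package {
        public class LineSplitter {

            // Created once when the class is initialized and shared by every call.
            private static const LINE_BREAK:RegExp = /\n/;

            // Reuses the shared RegExp, so no RegExp object is created per call.
            public static function splitReused(text:String):Array {
                return text.split(LINE_BREAK);
            }

            // Evaluates the literal on every call, which is the pattern
            // that test3 and test4 exercise.
            public static function splitInline(text:String):Array {
                return text.split(/\n/);
            }
        }
    }
    ```

    Both methods return the same result for the same input; the only intended difference is how often a RegExp is constructed. One caveat when sharing a RegExp this way: if the pattern uses the g flag, exec() and test() read and update its lastIndex property, so a shared instance carries state between calls.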

Answer 1: (score: 2)

For the given scenario, I wrote a test class that gives me all the information I need about which kind of regular expression to use:

package {
    import flash.utils.getTimer;
    import flash.text.TextFormat;
    import flash.text.TextField;
    import flash.display.Sprite;
    public class RegExpTest extends Sprite {

        private var textfield:TextField;

        public function RegExpTest() {
            this.textfield = new TextField();
            this.textfield.x = this.textfield.y = 10;
            this.textfield.width = stage.stageWidth - 20;
            this.textfield.height = stage.stageHeight - 20;
            this.textfield.defaultTextFormat = new TextFormat("Courier New");

            this.addChild(textfield);

            this.runtests();
        }

        private function runtests():void {
            output("Running Tests");
            output("-------------------------------");
            test("test1", 50000);
            test("test2", 50000);
            test("test3", 50000);
            test("test4", 50000);
            test("test5", 50000);
            test("test6", 50000);
        }

        private function test(methodName:String, iterations:int = 100):void {
            output("Testing method: " + methodName + ", " + iterations + " iterations...");    

            var wholeTimeStart:Number = getTimer();
            var iterationTimes:Array = [];

            for (var i:uint = 0; i < iterations; i++) {
                var iterationTimeStart:Number = getTimer();

                var tester:RegExpTester = new RegExpTester();
                // run method.
                (tester[methodName] as Function).apply();

                var iterationTimeEnd:Number = getTimer(); 
                iterationTimes.push(iterationTimeEnd - iterationTimeStart);
            }

            var wholeTimeEnd:Number = getTimer();

            var wholeTime:Number = wholeTimeEnd - wholeTimeStart;

            var average:Number = 0;
            var longest:Number = 0;
            var shortest:Number = int.MAX_VALUE;

            for each (var iteration:int in iterationTimes) {
                average += iteration;

                if (iteration > longest)
                    longest = iteration;

                if (iteration < shortest)
                    shortest = iteration;
            }

            average /= iterationTimes.length;

            output("Test Complete.");
            output("\tAverage Iteration Time: " + average + "ms");
            output("\tLongest Iteration Time: " + longest + "ms");
            output("\tShortest Iteration Time: " + shortest + "ms");
            output("\tTotal Test Time: " + wholeTime + "ms");
            output("-------------------------------");
        }

        private function output(message:String):void {
            this.textfield.appendText(message + "\n");
        }

    }
}

class RegExpTester {

    private static const expression4:RegExp = /.*a$/;

    private static const expression3:RegExp = new RegExp(".*a$");

    private var value:String = "There is a wonderful man which is quite smelly.";

    private var expression1:RegExp = new RegExp(".*a$");

    private var expression2:RegExp = /.*a$/;

    public function RegExpTester() {

    }

    public function test1():void {
        var result:Array = value.split(expression1);
    }

    public function test2():void {
        var result:Array = value.split(expression2);
    }

    public function test3():void {
        var result:Array = value.split(new RegExp(".*a$"));
    }

    public function test4():void {
        var result:Array = value.split(/.*a$/);
    }

    public function test5():void {
        var result:Array = value.split(expression3);
    }

    public function test6():void {
        var result:Array = value.split(expression4);
    }

}

The results from running this example are as follows:

Running Tests
-------------------------------
Testing method: test1, 50000 iterations...
Test Complete.
Average Iteration Time: 0.0272ms
Longest Iteration Time: 23ms
Shortest Iteration Time: 0ms
Total Test Time: 1431ms
-------------------------------
Testing method: test2, 50000 iterations...
Test Complete.
Average Iteration Time: 0.02588ms
Longest Iteration Time: 5ms
Shortest Iteration Time: 0ms
Total Test Time: 1367ms
-------------------------------
Testing method: test3, 50000 iterations...
Test Complete.
Average Iteration Time: 0.0288ms
Longest Iteration Time: 5ms
Shortest Iteration Time: 0ms
Total Test Time: 1498ms
-------------------------------
Testing method: test4, 50000 iterations...
Test Complete.
Average Iteration Time: 0.0291ms
Longest Iteration Time: 5ms
Shortest Iteration Time: 0ms
Total Test Time: 1495ms
-------------------------------
Testing method: test5, 50000 iterations...
Test Complete.
Average Iteration Time: 0.02638ms
Longest Iteration Time: 5ms
Shortest Iteration Time: 0ms
Total Test Time: 1381ms
-------------------------------
Testing method: test6, 50000 iterations...
Test Complete.
Average Iteration Time: 0.02666ms
Longest Iteration Time: 10ms
Shortest Iteration Time: 0ms
Total Test Time: 1382ms
-------------------------------

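Interesting, to say the least. It looks like the spread is not actually very large, and the compiler may be doing something behind the scenes to compile the regular expressions statically. Food for thought.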