Regex performance improvement

Time: 2016-08-30 04:47:33

Tags: c# .net regex

How can I improve the performance of this regex:

^([A-Za-z0-9+/=])*$

It is meant to match a string consisting of zero or more of the characters A-Z, a-z, 0-9, +, /, and =.

EDIT: This is the code:

internal static class Base64Determinator
{
    private static Regex base64Regex;

    static Base64Determinator()
    {
        var base64Pattern = "^(?>[A-Za-z0-9+/=]?)*$";
        base64Regex = new Regex(base64Pattern, RegexOptions.Compiled);
    }

    internal static bool IsBase64(string input)
    {
        return input != null && base64Regex.IsMatch(input);
    }
}

I am testing it in a loop in NUnit:

[Test]
public void TestInputs()
{
    var inputs = new List<Tuple<string, bool>>
    {
        new Tuple<string, bool>(null, false),
        new Tuple<string, bool>("asd", true),
        new Tuple<string, bool>("qwertyuiop", true),
        new Tuple<string, bool>("QWERTYUIOP", true),
        new Tuple<string, bool>(
            "QWERTYUIOPrfweiowcq489ynOILSDKFJSLDJfLKsdjflksdjflskdjfLSKDJflkLSKJFWIOEFJOIJSFLDKJflSDJFLKSJfsl",
            true),
        new Tuple<string, bool>("=+/Z0", true),
        new Tuple<string, bool>("=+/Z01234567890", true),
        new Tuple<string, bool>("!@#$%^&*()P", false),
        new Tuple<string, bool>("!", false),
        new Tuple<string, bool>("<", false),
        new Tuple<string, bool>(">", false),
        new Tuple<string, bool>(".", false),
        new Tuple<string, bool>("qwertyuiopoiuytrewqwertyuioplkjhgfdsadfghjklmnbvcxzxcvbnm<script>", false),
        new Tuple<string, bool>("qwertyuiopoiuytrewqwertyuioplkjhgfdsadfghjklmnbvcxzxcvbnm", true),
    };

    var startNew = Stopwatch.StartNew();
    for (int j = 0; j < 10000000; j++)
    {
        //inputs.ForEach(i => Assert.AreEqual(i.Item2, Base64Determinator.IsBase64(i.Item1)));
        inputs.ForEach(i => Base64Determinator.IsBase64(i.Item1));
    }

    startNew.Stop();

    Console.WriteLine(startNew.Elapsed);
}

This takes approximately 2.36 minutes. I have seen other implementations that take 2 seconds, so I am wondering whether there is a better way to write this regex with minimal backtracking.

Thanks, Rashmi

1 answer:

Answer 0: (score: 0)

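Performance depends on the complexity of the input. You can use the Regex.MatchTimeout property: define a time span that suits your needs (for example, 200 milliseconds) and display an appropriate message when the match times out.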

Regex.Match method (String, String, RegexOptions, TimeSpan)
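
The timeout suggestion could look roughly like the sketch below. It is not taken from the answer: the class name Base64DeterminatorWithTimeout is hypothetical, the pattern is copied from the question, and the 200 ms budget is only the illustrative figure mentioned above. It relies on the Regex constructor overload that accepts a TimeSpan and on RegexMatchTimeoutException, both available from .NET Framework 4.5 onward.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical variant of the question's class, shown only to illustrate a match timeout.
internal static class Base64DeterminatorWithTimeout
{
    // Assumption: 200 ms is the example budget from the answer, not a tuned value.
    private static readonly TimeSpan Timeout = TimeSpan.FromMilliseconds(200);

    // Same pattern as in the question; the third argument sets the match timeout.
    private static readonly Regex Base64Regex = new Regex(
        "^(?>[A-Za-z0-9+/=]?)*$",
        RegexOptions.Compiled,
        Timeout);

    internal static bool IsBase64(string input)
    {
        if (input == null)
        {
            return false;
        }

        try
        {
            return Base64Regex.IsMatch(input);
        }
        catch (RegexMatchTimeoutException ex)
        {
            // Report the timeout and treat the input as "not Base64".
            Console.WriteLine(
                "Regex timed out after " + ex.MatchTimeout +
                " on an input of length " + input.Length);
            return false;
        }
    }
}
```

Note that a timeout only bounds how long a single match may run; it does not make the pattern itself any faster, so the loop in the question would still need a cheaper pattern or a non-regex check to speed up.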