Array.Contains runs very slowly; can anyone shed some light?

Time: 2011-09-18 00:43:06

Tags: c# linq

I have run some benchmarks on List.Contains, Array.Contains, IEnumerable.Contains, ICollection.Contains and IList.Contains.

The results are:

array pure 00:00:45.0052754 // 45 sec, slow
array as IList 00:00:02.7900305
array as IEnumerable 00:00:46.5871087 // 45 sec, slow
array as ICollection 00:00:02.7449889
list pure 00:00:01.9907563
list as IList 00:00:02.6626009
list as IEnumerable 00:00:02.9541950
list as ICollection 00:00:02.3341203

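I found that calling Array.Contains directly is very slow (it is equivalent to calling it through IEnumerable).

I also find it odd that the MSDN Array page does not list a Contains method in its extension methods section.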
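
To make the "equivalent to calling IEnumerable" observation concrete, here is a minimal sketch of the three ways the call can be written. It assumes the .NET 4.0 / C# 4 setup used in this question, where `myarray.Contains(2)` resolves to the LINQ extension method `Enumerable.Contains` because arrays expose their generic `Contains` only through the collection interfaces. The class and method names are made up for illustration, and newer compilers may resolve the first call differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExtensionBindingDemo
{
    // Extension-method syntax. Assumption: under C# 4 / .NET 4.0 this
    // binds to System.Linq.Enumerable.Contains.
    public static bool ViaExtension(long[] data, long value)
    {
        return data.Contains(value);
    }

    // The same LINQ method written as an ordinary static call.
    public static bool ViaStaticCall(long[] data, long value)
    {
        return Enumerable.Contains(data, value);
    }

    // The interface method that arrays implement explicitly.
    public static bool ViaInterface(long[] data, long value)
    {
        return ((ICollection<long>)data).Contains(value);
    }

    public static void Main()
    {
        long[] myarray = { 1, 2, 3 };
        Console.WriteLine("{0} {1} {2}",
            ViaExtension(myarray, 2),
            ViaStaticCall(myarray, 2),
            ViaInterface(myarray, 2));
    }
}
```

All three calls return the same answer; only the route taken to reach the search differs, which is what the benchmark is trying to measure.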

Sample code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;

namespace arrayList
{
    class Program
    {
        static void Main(string[] args)
        {
            Stopwatch watch = new Stopwatch();
            Int64 n = 100000000;
            Int64[] myarray = new Int64[] { 1, 2, 3 };
            List<Int64> mylist = new List<Int64>(myarray);
            watch.Start();
            for (Int64 j = 0; j < n; j++)
            {

                bool i = myarray.Contains(2);

            }
            watch.Stop();
            Console.WriteLine("array pure {0}", watch.Elapsed);

            watch.Restart();
            for (Int64 j = 0; j < n; j++)
            {

                bool i = (myarray as IList<Int64>).Contains(2);

            }
            watch.Stop();
            Console.WriteLine("array as IList {0}",watch.Elapsed);

            watch.Restart();
            for (Int64 j = 0; j < n; j++)
            {

                bool i = (myarray as IEnumerable<Int64>).Contains(2);

            }
            watch.Stop();
            Console.WriteLine("array as IEnumerable {0}",watch.Elapsed);
            watch.Restart();
            for (Int64 j = 0; j < n; j++)
            {

                bool i = (myarray as ICollection<Int64>).Contains(2);

            }
            watch.Stop();
            Console.WriteLine("array as ICollection {0}",watch.Elapsed);

            watch.Restart();
            for (Int64 j = 0; j < n; j++)
            {

                bool i = mylist.Contains(2);

            }
            watch.Stop();
            Console.WriteLine("list pure {0}", watch.Elapsed);

            watch.Restart();
            for (Int64 j = 0; j < n; j++)
            {

                bool i = (mylist as IList<Int64>).Contains(2);

            }
            watch.Stop();
            Console.WriteLine("list as IList {0}", watch.Elapsed);

            watch.Restart();
            for (Int64 j = 0; j < n; j++)
            {

                bool i = (mylist as IEnumerable<Int64>).Contains(2);

            }
            watch.Stop();
            Console.WriteLine("list as IEnumerable {0}", watch.Elapsed);
            watch.Restart();
            for (Int64 j = 0; j < n; j++)
            {

                bool i = (mylist as ICollection<Int64>).Contains(2);

            }
            watch.Stop();
            Console.WriteLine("list as ICollection {0}", watch.Elapsed);
            Console.ReadKey();
        }
    }
}

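3 answers:

Answer 0: (score: 3)

The way you are timing this is not sufficient. You need much larger inputs to get timings that are representative of the algorithm. Yes, Contains() will be slower than a simple linear search (something you left out), but the different calls will not show the timings you reported. You will most likely see no variation between calls to Contains() when casting to the different types, since the same implementation is being called for all of them.

Try the following code: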

using System;
using System.Collections.Generic;
using System.Linq;

using System.Diagnostics;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            const int iterations = 1000000;
            const long target = 7192;
            var arr = Enumerable.Range(0, 10000).Select(i => (long)i).ToArray();
            var list = arr.ToList();

            bool result;

            var arr0 = Stopwatch.StartNew();
            for (var i = 0; i < iterations; i++)
            {
                result = LinearSearchArr(arr, target);
            }
            arr0.Stop();

            var arr1 = Stopwatch.StartNew();
            for (var i = 0; i < iterations; i++)
            {
                // actually Enumerable.Contains()
                result = arr.Contains(target);
            }
            arr1.Stop();

            var arr2 = Stopwatch.StartNew();
            for (var i = 0; i < iterations; i++)
            {
                result = ((IList<long>)arr).Contains(target);
            }
            arr2.Stop();

            var arr3 = Stopwatch.StartNew();
            for (var i = 0; i < iterations; i++)
            {
                result = ((IEnumerable<long>)arr).Contains(target);
            }
            arr3.Stop();

            var arr4 = Stopwatch.StartNew();
            for (var i = 0; i < iterations; i++)
            {
                result = ((ICollection<long>)arr).Contains(target);
            }
            arr4.Stop();

            var list0 = Stopwatch.StartNew();
            for (var i = 0; i < iterations; i++)
            {
                result = LinearSearchList(list, target);
            }
            list0.Stop();

            var list1 = Stopwatch.StartNew();
            for (var i = 0; i < iterations; i++)
            {
                result = list.Contains(target);
            }
            list1.Stop();

            var list2 = Stopwatch.StartNew();
            for (var i = 0; i < iterations; i++)
            {
                result = ((IList<long>)list).Contains(target);
            }
            list2.Stop();

            var list3 = Stopwatch.StartNew();
            for (var i = 0; i < iterations; i++)
            {
                result = ((IEnumerable<long>)list).Contains(target);
            }
            list3.Stop();

            var list4 = Stopwatch.StartNew();
            for (var i = 0; i < iterations; i++)
            {
                result = ((ICollection<long>)list).Contains(target);
            }
            list4.Stop();

            Console.WriteLine("array linear {0} ({1})", arr0.Elapsed, arr0.ElapsedTicks);
            Console.WriteLine("array pure {0} ({1})", arr1.Elapsed, arr1.ElapsedTicks);
            Console.WriteLine("array as IList {0} ({1})", arr2.Elapsed, arr2.ElapsedTicks);
            Console.WriteLine("array as IEnumerable {0} ({1})", arr3.Elapsed, arr3.ElapsedTicks);
            Console.WriteLine("array as ICollection {0} ({1})", arr4.Elapsed, arr4.ElapsedTicks);
            Console.WriteLine("list linear {0} ({1})", list0.Elapsed, list0.ElapsedTicks);
            Console.WriteLine("list pure {0} ({1})", list1.Elapsed, list1.ElapsedTicks);
            Console.WriteLine("list as IList {0} ({1})", list2.Elapsed, list2.ElapsedTicks);
            Console.WriteLine("list as IEnumerable {0} ({1})", list3.Elapsed, list3.ElapsedTicks);
            Console.WriteLine("list as ICollection {0} ({1})", list4.Elapsed, list4.ElapsedTicks);
        }

        static bool LinearSearchArr(long[] arr, long target)
        {
            for (var i = 0; i < arr.Length; i++)
            {
                if (arr[i] == target)
                {
                    return true;
                }
            }
            return false;
        }

        static bool LinearSearchList(List<long> list, long target)
        {
            for (var i = 0; i < list.Count; i++)
            {
                if (list[i] == target)
                {
                    return true;
                }
            }
            return false;
        }
    }
}

Specs:
Windows 7 Professional 64-bit
Intel Core 2 Quad Q9550 @ 2.83GHz
4x1GiB Corsair Dominator DDR2 1066 (PC2-8500)

Default .NET 4.0 console application, Release build targeting x64:

array linear 00:00:07.7268891 (21379939)
array pure 00:00:12.1703848 (33674883)
array as IList 00:00:12.1764948 (33691789)
array as IEnumerable 00:00:12.5377771 (34691440)
array as ICollection 00:00:12.1827855 (33709195)
list linear 00:00:17.9288343 (49608242)
list pure 00:00:25.8427338 (71505630)
list as IList 00:00:25.8678260 (71575059)
list as IEnumerable 00:00:25.8500101 (71525763)
list as ICollection 00:00:25.8423424 (71504547)

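Answer 1: (score: 0)

A guess: IList / List use ICollection.Contains, which walks the elements of the collection directly by index.

The Array and IEnumerable versions use IEnumerable.Contains, which has to create an enumerator and run generic iteration code (such as MoveNext calls).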
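
The two strategies this answer contrasts can be sketched side by side. The helper below is hypothetical (`MyContains` is not the framework source): it first tries the collection's own `Contains` and falls back to an enumerator loop with `MoveNext` calls. Which path the real `Enumerable.Contains` takes for an array is not established by this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class ContainsSketch
{
    // Hypothetical helper for illustration only, not the framework's code.
    public static bool MyContains<T>(IEnumerable<T> source, T value)
    {
        if (source == null) throw new ArgumentNullException("source");

        // Direct path: delegate to the collection's own Contains.
        ICollection<T> collection = source as ICollection<T>;
        if (collection != null)
        {
            return collection.Contains(value);
        }

        // Generic path: create an enumerator and step through it.
        EqualityComparer<T> comparer = EqualityComparer<T>.Default;
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            while (e.MoveNext())
            {
                if (comparer.Equals(e.Current, value)) return true;
            }
        }
        return false;
    }

    // A lazy sequence that is not an ICollection<T>, so it forces the generic path.
    public static IEnumerable<long> LazyOneToThree()
    {
        yield return 1;
        yield return 2;
        yield return 3;
    }

    public static void Main()
    {
        Console.WriteLine(MyContains(new long[] { 1, 2, 3 }, 2L)); // direct path
        Console.WriteLine(MyContains(LazyOneToThree(), 5L));       // generic path
    }
}
```

The generic path pays for an enumerator allocation plus an interface call per element, which is the overhead this answer suspects.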

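Answer 2: (score: 0)

Make sure you use the result of the Contains method somehow in your code, so that the call is not optimized away. My guess is that in one case it can use a hash table, while in the others it has to do a linear search. Either that, or it is not running your loop at all because the loop does nothing.

Either way, who writes code that then runs Contains a million times...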
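
One way to act on the advice about using the result is to accumulate the hits and print the total, so the loop has an observable effect. This is a minimal sketch; `CountHits` and the iteration count are made up for illustration, and whether the JIT would actually remove the original loops is not something this example demonstrates.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class UseResultDemo
{
    // Counts how many of the repeated Contains calls returned true,
    // so every call contributes to a value that is used afterwards.
    public static int CountHits(long[] data, long target, int iterations)
    {
        int hits = 0;
        for (int i = 0; i < iterations; i++)
        {
            if (data.Contains(target))
            {
                hits++;
            }
        }
        return hits;
    }

    public static void Main()
    {
        long[] data = { 1, 2, 3 };

        Stopwatch watch = Stopwatch.StartNew();
        int hits = CountHits(data, 2, 1000000);
        watch.Stop();

        // Printing the count keeps the result live past the timed loop.
        Console.WriteLine("hits={0} elapsed={1}", hits, watch.Elapsed);
    }
}
```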