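Array intersection function speed

Time: 2010-09-08 22:19:18

Tags: vb.net function array-intersect

I wrote a short function for array intersection and was wondering why one version is faster than the others.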

1)

Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length

Function newintersect(ByRef list1() As String) As String()
    Dim intersection As New ArrayList
    If (list1.Length < list2length) Then
        'use list2'
        For Each thing As String In list2
            If (Array.IndexOf(list1, thing) <> -1) Then
                intersection.Add(thing)
            End If
        Next
    Else
        'use list1'
        For Each thing As String In list1
            If (Array.IndexOf(list2, thing) <> -1) Then
                intersection.Add(thing)
            End If
        Next
    End If
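    'ArrayList is not a String(), so copy it into a typed array'
    Return DirectCast(intersection.ToArray(GetType(String)), String())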
End Function

2)

Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length

Function newintersect(ByRef list1() As String) As String()
    Dim intersection As New ArrayList
    If (list1.Length > list2length) Then 'changed >'
        'use list2'
        For Each thing As String In list2
            If (Array.IndexOf(list1, thing) <> -1) Then
                intersection.Add(thing)
            End If
        Next
    Else
        'use list1'
        For Each thing As String In list1
            If (Array.IndexOf(list2, thing) <> -1) Then
                intersection.Add(thing)
            End If
        Next
    End If
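    'ArrayList is not a String(), so copy it into a typed array'
    Return DirectCast(intersection.ToArray(GetType(String)), String())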
End Function

3)

Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length

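Function newintersect(ByRef list1() As String) As String()
    Dim intersection As New ArrayList 'was missing in this version'
    For Each thing As String In list1
        If (Array.IndexOf(list2, thing) <> -1) Then
            intersection.Add(thing)
        End If
    Next
    'ArrayList is not a String(), so copy it into a typed array'
    Return DirectCast(intersection.ToArray(GetType(String)), String())
End Function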

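So for my test case, 1 takes 65 seconds, 3 takes 63 seconds, while 2 actually takes 75 seconds. Does anyone know why 3 is the fastest? And why is 1 faster than 2?

(Sorry for the poor formatting... I can't seem to paste it properly)

3 Answers:

Answer 0: (score: 1)

It makes little difference. Besides, the methods don't appear to produce the same result, so comparing their performance is pointless, right?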
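To illustrate how the results can differ, here is a minimal sketch. It assumes the input contains duplicates; the helper name IntersectIterating and the sample data are made up for the illustration. Version 1 iterates whichever array is longer, while version 3 always iterates list1, so duplicates in list1 show up in one result and not in the other.

```vbnet
Imports System
Imports System.Collections.Generic

Module DuplicateSketch
    ' Hypothetical helper: keeps every item of outer that also occurs in inner,
    ' which is the shape of the loops in the question.
    Function IntersectIterating(outer() As String, inner() As String) As String()
        Dim result As New List(Of String)
        For Each item As String In outer
            If Array.IndexOf(inner, item) <> -1 Then
                result.Add(item)
            End If
        Next
        Return result.ToArray()
    End Function

    Sub Main()
        ' Made-up sample data; list1 is shorter and contains a duplicate.
        Dim list1() As String = {"a", "a", "b"}
        Dim list2() As String = {"a", "b", "c", "d"}

        ' Version 1 iterates list2 here, because list1 is the shorter array.
        Console.WriteLine(String.Join(",", IntersectIterating(list2, list1))) ' prints a,b

        ' Version 3 always iterates list1.
        Console.WriteLine(String.Join(",", IntersectIterating(list1, list2))) ' prints a,a,b
    End Sub
End Module
```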

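Anyway, Array.IndexOf is not a very efficient way of finding items, and it doesn't scale well. You should get a dramatic improvement if you use a hash-based collection for the lookup, like this: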

Function newintersect(ByRef list1 As String(), ByRef list2 As String()) As String()
  Dim smaller As HashSet(Of String)
  Dim larger As String()
  If list1.Length < list2.Length Then
    smaller = New HashSet(Of String)(list1)
    larger = list2
  Else
    smaller = New HashSet(Of String)(list2)
    larger = list1
  End If
  Dim intersection As New List(Of String)
  For Each item As String In larger
    If smaller.Contains(item) Then
      intersection.Add(item)
    End If
  Next
  Return intersection.ToArray()
End Function

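Answer 1: (score: 0)

I expect you'll find that, with a different test case, you can reverse the results above and reach a situation where 2 is the fastest and 1 & 3 are slower.

It's hard to comment without knowing the makeup of your test case. It will depend on where the "intersecting" items sit in the two arrays: if they tend to be near the front of one array and near the end of the other, the nesting order of the array iteration / IndexOf will perform noticeably differently.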
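As a rough sketch of that effect, the code below counts how many elements a front-to-back linear search has to inspect for each nesting order. The helper names and sample arrays are invented for the illustration, and it assumes Array.IndexOf scans from the start and stops at the first match.

```vbnet
Imports System

Module IndexOfCostSketch
    ' Hypothetical helper: number of elements a front-to-back scan inspects
    ' before finding needle, or the whole length when it is absent.
    Function ElementsScanned(haystack() As String, needle As String) As Integer
        Dim index As Integer = Array.IndexOf(haystack, needle)
        If index = -1 Then
            Return haystack.Length
        End If
        Return index + 1
    End Function

    ' Total elements inspected when every item of outer is looked up in inner.
    Function TotalScanned(outer() As String, inner() As String) As Long
        Dim total As Long = 0
        For Each item As String In outer
            total += ElementsScanned(inner, item)
        Next
        Return total
    End Function

    Sub Main()
        ' Made-up data: the shared items are at the front of a and the end of b.
        Dim a() As String = {"x", "y", "p", "q", "r", "s"}
        Dim b() As String = {"1", "2", "3", "4", "5", "6", "x", "y"}

        Console.WriteLine(TotalScanned(a, b)) ' prints 47
        Console.WriteLine(TotalScanned(b, a)) ' prints 39
    End Sub
End Module
```

The same two arrays cost a different amount of scanning depending on which one is the outer loop, which is why a different test case could reorder the timings.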

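BTW, there are better ways to perform an intersection. Sorting one or the other array and doing a BinarySearch is one way; using a Dictionary(Of String, ...) or similar is another. Either would give better performance.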
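A minimal sketch of the sort-then-BinarySearch idea might look like this. The function name IntersectSorted and the choice of an ordinal comparer are my assumptions, not something from the question; the copy is made so the caller's array is not reordered.

```vbnet
Imports System
Imports System.Collections.Generic

Module SortedIntersectSketch
    ' Hypothetical name: sorts a copy of list2 once, then binary-searches it
    ' for each item of list1 instead of scanning it linearly.
    Function IntersectSorted(list1() As String, list2() As String) As String()
        Dim sorted() As String = CType(list2.Clone(), String())
        Array.Sort(sorted, StringComparer.Ordinal)

        Dim result As New List(Of String)
        For Each item As String In list1
            ' BinarySearch returns a negative number when the item is absent.
            ' The comparer must match the one used for sorting.
            If Array.BinarySearch(sorted, item, StringComparer.Ordinal) >= 0 Then
                result.Add(item)
            End If
        Next
        Return result.ToArray()
    End Function

    Sub Main()
        ' Made-up sample data.
        Dim a() As String = {"pear", "apple", "fig"}
        Dim b() As String = {"fig", "kiwi", "pear"}
        Console.WriteLine(String.Join(", ", IntersectSorted(a, b))) ' prints pear, fig
    End Sub
End Module
```

The sort is paid once, after which each lookup is logarithmic rather than linear in the length of list2.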

Answer 2: (score: 0)

This is from the MSDN documentation:

    Dim id1() As Integer = {44, 26, 92, 30, 71, 38}
    Dim id2() As Integer = {39, 59, 83, 47, 26, 4, 30}

    ' Find the set intersection of the two arrays.
    Dim intersection As IEnumerable(Of Integer) = id1.Intersect(id2)

    For Each id As Integer In intersection
        Debug.WriteLine(id.ToString)
    Next