Comparing two slices

Time: 2014-05-26 12:23:05

Tags: go comparison slice

Is there a way in Go to compare two slices and get the elements in slice X that are not in slice Y, and vice versa?

    X := []int{10, 12, 12, 12, 13}
    Y := []int{12, 14, 15}

func compare(X, Y []int)  

calling compare(X, Y)   
    result1 := []int{10, 12, 12, 13} // if you're looking for elements in slice X that are not in slice Y

calling compare(Y, X)
    result2 := []int{14, 15} // if you're looking for elements in slice Y that are not in slice X

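5 answers:

Answer 0 (score: 7)

If order doesn't matter and the sets are large, you should use a set implementation and compare them with its difference function.

Sets are not part of the standard library, but you can use this library, for example, which can initialize a set directly from a slice: https://github.com/deckarep/golang-set

Something like this: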

package main

import (
    "fmt"

    set "github.com/deckarep/golang-set"
)

func main() {
    //note that the set accepts []interface{}
    X := []interface{}{10, 12, 12, 12, 13}
    Y := []interface{}{12, 14, 15}

    Sx := set.NewSetFromSlice(X)
    Sy := set.NewSetFromSlice(Y)
    result1 := Sx.Difference(Sy)
    result2 := Sy.Difference(Sx)

    fmt.Println(result1)
    fmt.Println(result2)
}

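Answer 1 (score: 3)

None of the solutions provided so far exactly answers the question as asked. Rather than the difference of the slices, they give the difference of the sets of elements in the slices.

Specifically, instead of the expected example: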

    X := []int{10, 12, 12, 12, 13}
    Y := []int{12, 14, 15}

func compare(X, Y []int)  

calling compare(X, Y)   
    result1 := []int{10, 12, 12, 13} // if you're looking for elements in slice X that are not in slice Y

calling compare(Y, X)
    result2 := []int{14, 15}

the provided solutions would result in:

result1 := []int{10,13}
result2 := []int{14,15}

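To produce exactly the example result, a different approach is needed. Here are two solutions:

If the slices are already sorted:

This solution may be faster if you sort your slices and then call compare. It will definitely be faster if your slices are already sorted.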

func compare(X, Y []int) []int {
    difference := make([]int, 0)
    var i, j int
    for i < len(X) && j < len(Y) {
        if X[i] < Y[j] {
            difference = append(difference, X[i])
            i++
        } else if X[i] > Y[j] {
            j++
        } else { //X[i] == Y[j]
            j++
            i++
        }
    }
    // Everything left in X is greater than every element of Y, so append it as is.
    difference = append(difference, X[i:]...)
    return difference
}

Go Playground version

Unsorted version using a map

func compareMapAlternate(X, Y []int) []int {
    counts := make(map[int]int)
    var total int
    for _, val := range X {
        counts[val] += 1
        total += 1
    }
    for _, val := range Y {
        if count := counts[val]; count > 0 {
            counts[val] -= 1
            total -= 1
        }
    }
    difference := make([]int, total)
    i := 0
    for val, count := range counts {
        for j := 0; j < count; j++ {
            difference[i] = val
            i++
        }
    }
    return difference
}

Go Playground version

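Edit: I have created a benchmark to test my two versions (the map version is slightly modified to delete zero-valued entries from the map). It won't run on the Go Playground because time doesn't work properly there, so I ran it on my own computer.

compareSort sorts the slices and then calls the iterative version of compare; compareSorted runs after compareSort and relies on the slices already being sorted.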

Case: len(X)== 10000 && len(Y)== 10000
--compareMap time: 4.0024ms
--compareMapAlternate time: 3.0225ms
--compareSort time: 4.9846ms
--compareSorted time: 1ms
--Result length == 6754 6754 6754 6754
Case: len(X)== 1000000 && len(Y)== 1000000
--compareMap time: 378.2492ms
--compareMapAlternate time: 387.2955ms
--compareSort time: 816.5619ms
--compareSorted time: 28.0432ms
--Result length == 673505 673505 673505 673505
Case: len(X)== 10000 && len(Y)== 1000000
--compareMap time: 35.0269ms
--compareMapAlternate time: 43.0492ms
--compareSort time: 385.2629ms
--compareSorted time: 3.0242ms
--Result length == 3747 3747 3747 3747
Case: len(X)== 1000000 && len(Y)== 10000
--compareMap time: 247.1561ms
--compareMapAlternate time: 240.1727ms
--compareSort time: 400.2875ms
--compareSorted time: 17.0311ms
--Result length == 993778 993778 993778 993778

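As you can see, if the slices are already sorted, the version without a map is much faster, but using a map is faster than sorting first and then using the iterative approach. For small inputs sorting may be quick enough that it should be used, but the benchmark would finish too quickly to be timed.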

Answer 2 (score: 2)

Something like this should work:

package main

import "fmt"

func main() {
    X := []int{10, 12, 12, 12, 13}
    Y := []int{12, 14, 15}

    fmt.Println(compare(X, Y))
    fmt.Println(compare(Y, X))
}

func compare(X, Y []int) []int {
    m := make(map[int]int)

    for _, y := range Y {
        m[y]++
    }

    var ret []int
    for _, x := range X {
        if m[x] > 0 {
            m[x]--
            continue
        }
        ret = append(ret, x)
    }

    return ret
}

http://play.golang.org/p/4DujR2staI

Answer 3 (score: 0)

Pulling in a set implementation might be overkill. This should be enough:

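// diff returns the elements of Y that do not occur in X.
// Note the direction: to get the elements of X that are not in Y, call diff(Y, X).
// Membership is all-or-nothing, so duplicate counts are ignored.
func diff(X, Y []int) []int {
  result := []int{}
  vals := map[int]struct{}{}

  // Record every value that appears in X.
  for _, x := range X {
    vals[x] = struct{}{}
  }

  // Keep each element of Y that was never seen in X.
  for _, y := range Y {
    if _, ok := vals[y]; !ok {
      result = append(result, y)
    }
  }

  return result
}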

If the slices get really big, a Bloom filter could be added.

Answer 4 (score: 0)

There is also the github.com/mb0/diff package (docs at godoc.org):

  

Package diff implements a difference algorithm. The algorithm is described in "An O(ND) Difference Algorithm and its Variations", Eugene Myers, Algorithmica Vol. 1 No. 2, 1986, pp. 251-266.