Is there a way in Go to compare two slices and get the elements in slice X that are not in slice Y, and vice versa?
X := []int{10, 12, 12, 12, 13}
Y := []int{12, 14, 15}
func compare(X, Y []int)
calling compare(X, Y)
result1 := []int{10, 12, 12, 13} // if you're looking for elements in slice X that are not in slice Y
calling compare(Y, X)
result2 := []int{14, 15} // if you're looking for elements in slice Y that are not in slice X
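Answer 0 (score: 7)
If order does not matter and the collections are large, you should use a set implementation and compare them with its difference function.
Sets are not part of the standard library, but you can use this library, which can, for example, initialize a set directly from a slice: https://github.com/deckarep/golang-set
Something like this: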
package main
import (
set "github.com/deckarep/golang-set"
"fmt"
)
func main() {
//note that the set accepts []interface{}
X := []interface{}{10, 12, 12, 12, 13}
Y := []interface{}{12, 14, 15}
Sx := set.NewSetFromSlice(X)
Sy := set.NewSetFromSlice(Y)
result1 := Sx.Difference(Sy)
result2 := Sy.Difference(Sx)
fmt.Println(result1)
fmt.Println(result2)
}
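If pulling in a dependency is not desirable, the same set difference can be sketched with a plain map from the standard library. The name setDifference is made up for illustration and is not part of golang-set. Like any set operation it drops duplicates, so X minus Y comes out as 10 and 13 rather than the 10, 12, 12, 13 asked for in the question.

```go
package main

import (
	"fmt"
	"sort"
)

// setDifference returns the distinct values of a that do not occur in b.
// Hypothetical helper for illustration; the result is sorted so the output
// is stable from run to run.
func setDifference(a, b []int) []int {
	inB := make(map[int]struct{}, len(b))
	for _, v := range b {
		inB[v] = struct{}{}
	}
	seen := make(map[int]struct{})
	out := []int{}
	for _, v := range a {
		if _, skip := inB[v]; skip {
			continue // v is in b, so it is not part of the difference
		}
		if _, dup := seen[v]; dup {
			continue // already reported once; sets hold no duplicates
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	sort.Ints(out)
	return out
}

func main() {
	X := []int{10, 12, 12, 12, 13}
	Y := []int{12, 14, 15}
	fmt.Println(setDifference(X, Y)) // [10 13]
	fmt.Println(setDifference(Y, X)) // [14 15]
}
```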
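Answer 1 (score: 3)
None of the solutions provided answers the question exactly as asked. Rather than the difference of the slices, they give the difference of the sets of elements in the slices.
Specifically, instead of the expected example: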
X := []int{10, 12, 12, 12, 13}
Y := []int{12, 14, 15}
func compare(X, Y []int)
calling compare(X, Y)
result1 := []int{10, 12, 12, 13} // if you're looking for elements in slice X that are not in slice Y
calling compare(Y, X)
result2 := []int{14, 15}
the provided solutions would result in:
result1 := []int{10,13}
result2 := []int{14,15}
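To produce exactly the example results, a different approach is needed. Here are two solutions:
If the slices are already sorted:
This solution may be faster if you sort your slices and then call compare. It will definitely be faster if your slices are already sorted.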
func compare(X, Y []int) []int {
difference := make([]int, 0)
var i, j int
for i < len(X) && j < len(Y) {
if X[i] < Y[j] {
difference = append(difference, X[i])
i++
} else if X[i] > Y[j] {
j++
} else { //X[i] == Y[j]
j++
i++
}
}
if i < len(X) { //All remaining in X are greater than Y, just copy over
finalLength := len(X) - i + len(difference)
if finalLength > cap(difference) {
newDifference := make([]int, finalLength)
copy(newDifference, difference)
copy(newDifference[len(difference):], X[i:])
difference = newDifference
} else {
differenceLen := len(difference)
difference = difference[:finalLength]
copy(difference[differenceLen:], X[i:])
}
}
return difference
}
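For inputs that are not sorted yet, the "sort, then call compare" idea could look roughly like the sketch below. It is a reconstruction, not the benchmarked code: compareSorted is a condensed form of the merge walk above, and compareSort sorts copies so the caller's slices are left untouched (an assumption about what is wanted; sorting in place saves the copies).

```go
package main

import (
	"fmt"
	"sort"
)

// compareSorted is a condensed merge-style difference.
// Both inputs must be sorted in ascending order.
func compareSorted(X, Y []int) []int {
	difference := make([]int, 0)
	i, j := 0, 0
	for i < len(X) && j < len(Y) {
		switch {
		case X[i] < Y[j]: // X[i] has no partner in Y
			difference = append(difference, X[i])
			i++
		case X[i] > Y[j]: // Y[j] is unused, skip it
			j++
		default: // equal: one element of Y cancels one element of X
			i++
			j++
		}
	}
	// Everything left in X is greater than all of Y.
	return append(difference, X[i:]...)
}

// compareSort sorts copies of both slices and then walks them.
func compareSort(X, Y []int) []int {
	xs := append([]int(nil), X...)
	ys := append([]int(nil), Y...)
	sort.Ints(xs)
	sort.Ints(ys)
	return compareSorted(xs, ys)
}

func main() {
	X := []int{13, 12, 10, 12, 12}
	Y := []int{15, 12, 14}
	fmt.Println(compareSort(X, Y)) // [10 12 12 13]
	fmt.Println(compareSort(Y, X)) // [14 15]
}
```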
Unsorted version using a map:
func compareMapAlternate(X, Y []int) []int {
counts := make(map[int]int)
var total int
for _, val := range X {
counts[val] += 1
total += 1
}
for _, val := range Y {
if count := counts[val]; count > 0 {
counts[val] -= 1
total -= 1
}
}
difference := make([]int, total)
i := 0
// Map iteration order is unspecified, so the result is not in X's original order.
for val, count := range counts {
for j := 0; j < count; j++ {
difference[i] = val
i++
}
}
return difference
}
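Edit: I have created a benchmark to test my two versions (the map version is slightly modified to delete zero values from the map). It won't run on the Go Playground because time does not work properly there, so I ran it on my own computer.
compareSort sorts the slices and calls the iterative version of compare; compareSorted runs after compareSort and relies on the slices already being sorted.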
Case: len(X)== 10000 && len(Y)== 10000
--compareMap time: 4.0024ms
--compareMapAlternate time: 3.0225ms
--compareSort time: 4.9846ms
--compareSorted time: 1ms
--Result length == 6754 6754 6754 6754
Case: len(X)== 1000000 && len(Y)== 1000000
--compareMap time: 378.2492ms
--compareMapAlternate time: 387.2955ms
--compareSort time: 816.5619ms
--compareSorted time: 28.0432ms
--Result length == 673505 673505 673505 673505
Case: len(X)== 10000 && len(Y)== 1000000
--compareMap time: 35.0269ms
--compareMapAlternate time: 43.0492ms
--compareSort time: 385.2629ms
--compareSorted time: 3.0242ms
--Result length == 3747 3747 3747 3747
Case: len(X)== 1000000 && len(Y)== 10000
--compareMap time: 247.1561ms
--compareMapAlternate time: 240.1727ms
--compareSort time: 400.2875ms
--compareSorted time: 17.0311ms
--Result length == 993778 993778 993778 993778
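As you can see, if the slices are already sorted, skipping the map is much faster, but using a map is faster than sorting and then using the iterative approach. For small inputs sorting may be fast enough that it should be used, but the benchmark would finish too quickly to be timed.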
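The benchmark code itself is not shown, so the following is only a rough sketch of how such a timing run could be set up. The helper names (mapDiff, sortedDiff, randomInts, timeIt), the fixed seed, and the value range are all assumptions, and a single wall-clock run like this is noisy; go test -bench is the usual tool for real measurements.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
	"time"
)

// mapDiff counts Y's values, then keeps each x that has no unused match left.
func mapDiff(X, Y []int) []int {
	counts := make(map[int]int, len(Y))
	for _, y := range Y {
		counts[y]++
	}
	out := make([]int, 0, len(X))
	for _, x := range X {
		if counts[x] > 0 {
			counts[x]--
			continue
		}
		out = append(out, x)
	}
	return out
}

// sortedDiff walks two ascending slices in lockstep.
func sortedDiff(X, Y []int) []int {
	out := make([]int, 0, len(X))
	i, j := 0, 0
	for i < len(X) && j < len(Y) {
		switch {
		case X[i] < Y[j]:
			out = append(out, X[i])
			i++
		case X[i] > Y[j]:
			j++
		default:
			i++
			j++
		}
	}
	return append(out, X[i:]...)
}

// randomInts returns n pseudo-random values in [0, limit).
func randomInts(r *rand.Rand, n, limit int) []int {
	s := make([]int, n)
	for i := range s {
		s[i] = r.Intn(limit)
	}
	return s
}

// timeIt runs f once and reports the result length and elapsed wall time.
func timeIt(f func() []int) (int, time.Duration) {
	start := time.Now()
	n := len(f())
	return n, time.Since(start)
}

func main() {
	r := rand.New(rand.NewSource(1)) // fixed seed so runs are comparable
	X := randomInts(r, 10000, 20000)
	Y := randomInts(r, 10000, 20000)

	n, d := timeIt(func() []int { return mapDiff(X, Y) })
	fmt.Printf("map:            %d elements in %v\n", n, d)

	// Sort copies so the map run above and this run see the same data.
	xs := append([]int(nil), X...)
	ys := append([]int(nil), Y...)
	n, d = timeIt(func() []int {
		sort.Ints(xs)
		sort.Ints(ys)
		return sortedDiff(xs, ys)
	})
	fmt.Printf("sort + walk:    %d elements in %v\n", n, d)

	// xs and ys are sorted now, so this measures the walk alone.
	n, d = timeIt(func() []int { return sortedDiff(xs, ys) })
	fmt.Printf("already sorted: %d elements in %v\n", n, d)
}
```

All three runs should report the same result length, which is a cheap sanity check that the variants agree.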
Answer 2 (score: 2)
Something like this should work:
package main
import "fmt"
func main() {
X := []int{10, 12, 12, 12, 13}
Y := []int{12, 14, 15}
fmt.Println(compare(X, Y))
fmt.Println(compare(Y, X))
}
func compare(X, Y []int) []int {
m := make(map[int]int)
for _, y := range Y {
m[y]++
}
var ret []int
for _, x := range X {
if m[x] > 0 {
m[x]--
continue
}
ret = append(ret, x)
}
return ret
}
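With Go 1.18 or later, the same counting idea can be written once for any comparable element type. This is a sketch under that version assumption; the name difference is chosen here for illustration.

```go
package main

import "fmt"

// difference returns the elements of x that remain after each element of y
// cancels at most one equal element of x. The order of x is preserved.
func difference[T comparable](x, y []T) []T {
	remaining := make(map[T]int, len(y))
	for _, v := range y {
		remaining[v]++
	}
	var out []T
	for _, v := range x {
		if remaining[v] > 0 {
			remaining[v]-- // use up one matching element of y
			continue
		}
		out = append(out, v)
	}
	return out
}

func main() {
	fmt.Println(difference([]int{10, 12, 12, 12, 13}, []int{12, 14, 15})) // [10 12 12 13]
	fmt.Println(difference([]string{"a", "b", "b"}, []string{"b", "c"}))   // [a b]
}
```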
Answer 3 (score: 0)
Dragging in a set implementation might be overkill. This should be enough:
// diff returns the elements of Y that are not present in X (duplicates in Y are kept).
func diff(X, Y []int) []int {
diff := []int{}
vals := map[int]struct{}{}
for _, x := range X {
vals[x] = struct{}{}
}
for _, x := range Y {
if _, ok := vals[x]; !ok {
diff = append(diff, x)
}
}
return diff
}
If the slices get very large, a Bloom filter can be added.
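One way to read the Bloom filter suggestion is as a cheap pre-check in front of the exact lookup. The sketch below is hand-rolled for illustration: the bloom type, the sizing (about 10 bits per element, 4 probes), and the FNV-based double hashing are assumptions, not a tuned design. A filter can report false positives but never false negatives, so a "no" is final and a "maybe" still goes to the map. Whether this pays off depends on how expensive the exact lookup is; with an in-memory map it may well not.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strconv"
)

// bloom is a minimal Bloom filter: k probe positions per value in m bits.
type bloom struct {
	bits []uint64
	m, k uint64
}

func newBloom(m, k uint64) *bloom {
	return &bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes derives two base hashes for v; probe i uses h1 + i*h2.
func hashes(v int) (uint64, uint64) {
	h := fnv.New64a()
	h.Write([]byte(strconv.Itoa(v)))
	h1 := h.Sum64()
	h.Write([]byte{0xff}) // extend the input to get a second hash
	return h1, h.Sum64() | 1
}

func (b *bloom) add(v int) {
	h1, h2 := hashes(v)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= uint64(1) << (pos % 64)
	}
}

// mayContain returns false only if v was definitely never added.
func (b *bloom) mayContain(v int) bool {
	h1, h2 := hashes(v)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(uint64(1)<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

// diffBloom returns the elements of Y that are not in X, like diff above,
// asking the filter first and the exact map only on a possible hit.
func diffBloom(X, Y []int) []int {
	filter := newBloom(uint64(len(X))*10+64, 4)
	vals := make(map[int]struct{}, len(X))
	for _, x := range X {
		filter.add(x)
		vals[x] = struct{}{}
	}
	diff := []int{}
	for _, y := range Y {
		if !filter.mayContain(y) {
			diff = append(diff, y) // definitely absent from X
			continue
		}
		if _, ok := vals[y]; !ok {
			diff = append(diff, y) // the filter gave a false positive
		}
	}
	return diff
}

func main() {
	X := []int{10, 12, 12, 12, 13}
	Y := []int{12, 14, 15}
	fmt.Println(diffBloom(X, Y)) // [14 15]
	fmt.Println(diffBloom(Y, X)) // [10 13]
}
```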
Answer 4 (score: 0)
There is also the github.com/mb0/diff package (docs at godoc.org):
Package diff implements a difference algorithm. The algorithm is described in "An O(ND) Difference Algorithm and its Variations", Eugene Myers, Algorithmica Vol. 1, 1986, pp. 251-266.
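To show what an order-aware diff reports, as opposed to the counting approaches above, here is a small standard-library sketch. It is not the mb0/diff API and not the Myers O(ND) algorithm; it uses a plain longest-common-subsequence table, which costs O(len(a)*len(b)) time and memory, and the name lcsDiff is made up for illustration.

```go
package main

import "fmt"

// lcsDiff reports which elements must be deleted from a and which must be
// inserted from b to turn a into b, keeping a longest common subsequence.
func lcsDiff(a, b []int) (deleted, inserted []int) {
	// lcs[i][j] is the LCS length of a[i:] and b[j:].
	lcs := make([][]int, len(a)+1)
	for i := range lcs {
		lcs[i] = make([]int, len(b)+1)
	}
	for i := len(a) - 1; i >= 0; i-- {
		for j := len(b) - 1; j >= 0; j-- {
			switch {
			case a[i] == b[j]:
				lcs[i][j] = lcs[i+1][j+1] + 1
			case lcs[i+1][j] >= lcs[i][j+1]:
				lcs[i][j] = lcs[i+1][j]
			default:
				lcs[i][j] = lcs[i][j+1]
			}
		}
	}
	// Walk the table from the start, emitting deletions and insertions.
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]: // common element, keep it
			i++
			j++
		case lcs[i+1][j] >= lcs[i][j+1]: // dropping a[i] loses nothing
			deleted = append(deleted, a[i])
			i++
		default:
			inserted = append(inserted, b[j])
			j++
		}
	}
	deleted = append(deleted, a[i:]...)
	inserted = append(inserted, b[j:]...)
	return deleted, inserted
}

func main() {
	X := []int{10, 12, 12, 12, 13}
	Y := []int{12, 14, 15}
	deleted, inserted := lcsDiff(X, Y)
	fmt.Println(deleted, inserted) // [10 12 12 13] [14 15]
}
```

Unlike the map-based versions, this depends on element order: for [1 2] against [2 1] it reports one deletion and one insertion, while a counting difference would report nothing.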