Go sort.Sort output is random and wrong

Time: 2015-03-13 05:28:01

Tags: sorting go

I have a custom Sort function applied to a struct. The full code is here on play.golang.org

type Stmt struct {
    Name  string
    After []string
}

func sortStmts(stmts []Stmt) []Stmt {
    sort.Sort(ByAfter(stmts))
    return stmts
}

type ByAfter []Stmt

func (a ByAfter) Len() int      { return len(a) }
func (a ByAfter) Swap(i, j int) { a[i], a[j] = a[j], a[i] }
func (a ByAfter) Less(i, j int) bool {
    isLess := true

    //fmt.Printf("%s.%+v is being compared with %s.%+v\n", a[i].Name, a[i].After, a[j].Name, a[j].After)

    for _, v := range a[i].After {
        if a[j].Name == v {
            isLess = false
            break
        }
    }

    if isLess {
        //fmt.Printf("%s.%+v is Before %s.%+v\n", a[i].Name, a[i].After, a[j].Name, a[j].After)
    } else {
        //fmt.Printf("%s.%+v is After %s.%+v\n", a[i].Name, a[i].After, a[j].Name, a[j].After)
    }

    return isLess
}

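My goal is to automatically produce a correctly ordered set of SQL create statements, so that the tables other tables depend on come first.

So if there is a Stmt{Name: "user_role", After: []string{"user", "role"} }, then in the sorted list user_role should come after user and role.

This seemed to work fine until we started adding more values. Only then did I go in and check, and I realized that I may have been accidentally lucky the first time and that there really is no consistency.

Is there an error in my sort function that makes the results random? I am specifically interested in why the "role" item does not come before the "user_role" item, even though I have specified user_role as coming after role.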

3 answers:

Answer 0 (score: 7)

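Your "Less" function is not transitive. That is, if A < B and B < C, then A < C must also hold.

You cannot specify a partial order with the regular sort functions and get sorted output. Instead, you need to implement a topological sort.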
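To make the problem concrete, here is a minimal sketch that restates the comparison from the question as a standalone function (the name `less` is only for illustration) and evaluates it on three of the question's entries. It shows one triple where A < B and B < C hold but A < C does not, and one pair where each element is reported as less than the other.

```go
package main

import "fmt"

type Stmt struct {
	Name  string
	After []string
}

// less mirrors the logic of the question's Less method: a is reported as
// "less" than b unless b is listed in a.After. The function name is
// hypothetical; only the logic comes from the question.
func less(a, b Stmt) bool {
	for _, v := range a.After {
		if v == b.Name {
			return false
		}
	}
	return true
}

func main() {
	userRole := Stmt{Name: "user_role", After: []string{"app_user", "role"}}
	billing := Stmt{Name: "billingplan"}
	role := Stmt{Name: "role"}

	// Transitivity fails: user_role < billingplan and billingplan < role,
	// yet user_role < role is false.
	fmt.Println(less(userRole, billing)) // true
	fmt.Println(less(billing, role))     // true
	fmt.Println(less(userRole, role))    // false

	// Asymmetry fails too: unrelated items are each "less" than the other.
	fmt.Println(less(billing, role), less(role, billing)) // true true
}
```

Because sort.Sort assumes a consistent ordering, it is free to infer the order of user_role and role from other comparisons, which is why the result looks random.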

Here is a simple implementation run on your data (except that I removed the duplicate "passwords" entry).

package main

import "fmt"

type Stmt struct {
    Name  string
    After []string
}

func topSort(ss []Stmt) []string {
    after := map[string][]string{} // things that must come after
    counts := map[string]int{}     // number of unsatisfied preconditions
    zc := map[string]struct{}{}    // things with zero count
    for _, s := range ss {
        for _, a := range s.After {
            after[a] = append(after[a], s.Name)
        }
        counts[s.Name] = len(s.After)
        if len(s.After) == 0 {
            zc[s.Name] = struct{}{}
        }
    }

    r := []string{}
    for len(zc) > 0 {
        for n := range zc {
            r = append(r, n)
            for _, a := range after[n] {
                counts[a]--
                if counts[a] == 0 {
                    zc[a] = struct{}{}
                }
            }
            delete(zc, n)
        }
    }
    return r
}

func main() {
    stmts := []Stmt{
        {Name: "app", After: []string{"app_user"}},
        {Name: "billingplan", After: []string{}},
        {Name: "campaign", After: []string{"app_user"}},
        {Name: "campaign_app", After: []string{"campaign", "app"}},
        {Name: "campaign_ip", After: []string{"campaign", "ip"}},
        {Name: "campaign_operator", After: []string{"campaign", "operator"}},
        {Name: "campaign_sponsor", After: []string{"campaign", "sponsor"}},
        {Name: "campaign_subscriberfilter", After: []string{"campaign", "subscriber_filters"}},
        {Name: "campaign_url", After: []string{"campaign", "url"}},
        {Name: "contentpartner", After: []string{"app_user"}},
        {Name: "filter_criteria", After: []string{"campaign", "subscriber_filters"}},
        {Name: "ip", After: []string{"app_user"}},
        {Name: "mobile_registered", After: []string{"campaign", "app"}},
        {Name: "operator", After: []string{}},
        {Name: "passwords", After: []string{"app_user"}},
        {Name: "publish_package", After: []string{}},
        {Name: "role", After: []string{}},
        {Name: "sponsor", After: []string{"app_user"}},
        {Name: "subscriber_dbs", After: []string{}},
        {Name: "subscriber_filters", After: []string{"subscriber_dbs"}},
        {Name: "timezone", After: []string{}},
        {Name: "url", After: []string{"app_user"}},
        {Name: "app_user", After: []string{}},
        {Name: "user_role", After: []string{"app_user", "role"}},
    }
    r := topSort(stmts)
    for _, s := range r {
        fmt.Println(s)
    }
}

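Answer 1 (score: 1)

As Anonymous said, you need a topological sort. Tarjan's strongly connected components algorithm has the property that the SCCs are returned in reverse topological order. This means it can be used as a topological sort algorithm.

Here is an implementation of Tarjan's algorithm based on the pseudocode on Wikipedia (runnable here; I originally posted it to the golang-nuts list). A more general implementation exists, using essentially the same underlying code: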

package main

import (
    "fmt"
    "log"
)

type Stmt struct {
    Name  string
    After []string
}

func main() {
    stmts := []Stmt{
        {Name: "app", After: []string{"app_user"}},
        {Name: "billingplan", After: []string{}},
        {Name: "campaign", After: []string{"app_user"}},
        {Name: "campaign_app", After: []string{"campaign", "app"}},
        {Name: "campaign_ip", After: []string{"campaign", "ip"}},
        {Name: "campaign_operator", After: []string{"campaign", "operator"}},
        {Name: "campaign_sponsor", After: []string{"campaign", "sponsor"}},
        {Name: "campaign_subscriberfilter", After: []string{"campaign", "subscriber_filters"}},
        {Name: "campaign_url", After: []string{"campaign", "url"}},
        {Name: "contentpartner", After: []string{"app_user"}},
        {Name: "filter_criteria", After: []string{"campaign", "subscriber_filters"}},
        {Name: "ip", After: []string{"app_user"}},
        {Name: "mobile_registered", After: []string{"campaign", "app"}},
        {Name: "operator", After: []string{}},
        {Name: "passwords", After: []string{"app_user"}},
        {Name: "publish_package", After: []string{}},
        {Name: "role", After: []string{}},
        {Name: "passwords", After: []string{"app_user"}},
        {Name: "sponsor", After: []string{"app_user"}},
        {Name: "subscriber_dbs", After: []string{}},
        {Name: "subscriber_filters", After: []string{"subscriber_dbs"}},
        {Name: "timezone", After: []string{}},
        {Name: "url", After: []string{"app_user"}},
        {Name: "app_user", After: []string{}},
        {Name: "user_role", After: []string{"app_user", "role"}},
    }

    g := make(graph)
    for _, s := range stmts {
        g[s.Name] = after(s.After)
    }

    sorted, err := topoSort(g)
    if err != nil {
        log.Fatalf("could not sort: %v", err)
    }
    for _, s := range sorted {
        fmt.Println(s)
    }
}

func topoSort(g graph) ([]string, error) {
    sccs := tarjanSCC(g)
    sorted := make([]string, len(sccs))
    for i, s := range sccs {
        if len(s) != 1 {
            return nil, fmt.Errorf("found directed cycle: %q", s)
        }
        sorted[i] = s[0]
    }
    return sorted, nil
}

// graph is an edge list representation of a directed graph.
type graph map[string]set

// set is a string set.
type set map[string]struct{}

func after(i []string) set {
    if len(i) == 0 {
        return nil
    }
    s := make(set)
    for _, v := range i {
        s[v] = struct{}{}
    }
    return s
}

// tarjanSCC returns the strongly connected components of the
// directed graph g.
func tarjanSCC(g graph) [][]string {
    t := tarjan{
        g: g,

        indexTable: make(map[string]int, len(g)),
        lowLink:    make(map[string]int, len(g)),
        onStack:    make(map[string]bool, len(g)),
    }
    for v := range t.g {
        if t.indexTable[v] == 0 {
            t.strongconnect(v)
        }
    }
    return t.sccs
}

// tarjan implements Tarjan's strongly connected component finding
// algorithm. The implementation is from the pseudocode at
//
// http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm
//
type tarjan struct {
    g graph

    index      int
    indexTable map[string]int
    lowLink    map[string]int
    onStack    map[string]bool

    stack []string

    sccs [][]string
}

// strongconnect is the strongconnect function described in the
// wikipedia article.
func (t *tarjan) strongconnect(v string) {
    // Set the depth index for v to the smallest unused index.
    t.index++
    t.indexTable[v] = t.index
    t.lowLink[v] = t.index
    t.stack = append(t.stack, v)
    t.onStack[v] = true

    // Consider successors of v.
    for w := range t.g[v] {
        if t.indexTable[w] == 0 {
            // Successor w has not yet been visited; recur on it.
            t.strongconnect(w)
            t.lowLink[v] = min(t.lowLink[v], t.lowLink[w])
        } else if t.onStack[w] {
            // Successor w is in stack s and hence in the current SCC.
            t.lowLink[v] = min(t.lowLink[v], t.indexTable[w])
        }
    }

    // If v is a root node, pop the stack and generate an SCC.
    if t.lowLink[v] == t.indexTable[v] {
        // Start a new strongly connected component.
        var (
            scc []string
            w   string
        )
        for {
            w, t.stack = t.stack[len(t.stack)-1], t.stack[:len(t.stack)-1]
            t.onStack[w] = false
            // Add w to current strongly connected component.
            scc = append(scc, w)
            if w == v {
                break
            }
        }
        // Output the current strongly connected component.
        t.sccs = append(t.sccs, scc)
    }
}

func min(a, b int) int {
    if a < b {
        return a
    }
    return b
}

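Note that running this code repeatedly will not produce exactly the same ordering every time, because many of the paths cannot be definitively ordered relative to each other. This does not show up in the playground, since results are cached there, but you can see it by wrapping the call to tarjanSCC in a loop.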
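Since several different orderings are valid, comparing the output against one fixed list is not a useful test. A small checker along the following lines can confirm that any given ordering respects the dependencies. This is a sketch under assumptions: `checkOrder` and its error messages are hypothetical and not part of the posted code.

```go
package main

import "fmt"

type Stmt struct {
	Name  string
	After []string
}

// checkOrder is a hypothetical helper: it returns nil if every statement
// appears in order after all the names in its After list, and otherwise
// an error describing the first problem found.
func checkOrder(stmts []Stmt, order []string) error {
	pos := make(map[string]int, len(order))
	for i, name := range order {
		pos[name] = i
	}
	for _, s := range stmts {
		p, ok := pos[s.Name]
		if !ok {
			return fmt.Errorf("%q is missing from the order", s.Name)
		}
		for _, a := range s.After {
			q, ok := pos[a]
			if !ok {
				return fmt.Errorf("%q (needed by %q) is missing from the order", a, s.Name)
			}
			if q >= p {
				return fmt.Errorf("%q does not come after its dependency %q", s.Name, a)
			}
		}
	}
	return nil
}

func main() {
	stmts := []Stmt{
		{Name: "user_role", After: []string{"app_user", "role"}},
		{Name: "role"},
		{Name: "app_user"},
	}
	// Both of these orderings are valid topological sorts.
	fmt.Println(checkOrder(stmts, []string{"app_user", "role", "user_role"}))
	fmt.Println(checkOrder(stmts, []string{"role", "app_user", "user_role"}))
	// This one places user_role before role, so an error is returned.
	fmt.Println(checkOrder(stmts, []string{"app_user", "user_role", "role"}))
}
```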

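Although it may be easier to implement a topological sort directly, using Tarjan's SCC algorithm lets us find the cause of a failure to sort, for example here (compare with the same data here).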

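Answer 2 (score: 0)

This question was answered here: https://groups.google.com/forum/#!topic/golang-nuts/C_7JY1f3cSc. That is a more detailed and specific answer that I was able to use immediately. I asked the author to post it here, but he/she has not, so I am doing it myself.

Since I have a tarjan lying around, here is a topological sort of your data:

http://play.golang.org/p/SHagFMvuhl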

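On Thu, 2015-03-12 at 22:47 -0700, Sathish VJ wrote:

In my case, I did check, and the dependencies do have the transitive property.

One specific thing I noticed when I printed out debug statements is that some comparisons never happen. For example, user_role is now never compared with role, although at one point there were fewer elements.

sort.Sort does not make all comparisons. That would result in O(n^2) time.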
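The point about skipped comparisons can be checked with a small counting wrapper. The sketch below is illustrative only: the `countingInts` type, the `countComparisons` function, and the pseudo-random input are assumptions, not part of the thread. It counts the calls to Less made while sorting n integers and returns that count next to the number of distinct pairs, n*(n-1)/2.

```go
package main

import (
	"fmt"
	"sort"
)

// countingInts is a hypothetical sort.Interface wrapper that counts how
// many times Less is called.
type countingInts struct {
	vals  []int
	calls int
}

func (c *countingInts) Len() int      { return len(c.vals) }
func (c *countingInts) Swap(i, j int) { c.vals[i], c.vals[j] = c.vals[j], c.vals[i] }
func (c *countingInts) Less(i, j int) bool {
	c.calls++
	return c.vals[i] < c.vals[j]
}

// countComparisons sorts n pseudo-random ints with sort.Sort and returns
// the number of Less calls together with the number of distinct pairs.
func countComparisons(n int) (calls, allPairs int) {
	c := &countingInts{vals: make([]int, n)}
	x := 12345 // fixed seed so every run sorts the same input
	for i := range c.vals {
		x = (x*1103515245 + 12345) % 2147483648 // simple linear congruential generator
		c.vals[i] = x
	}
	sort.Sort(c)
	return c.calls, n * (n - 1) / 2
}

func main() {
	calls, allPairs := countComparisons(1000)
	fmt.Printf("Less was called %d times; there are %d distinct pairs\n", calls, allPairs)
}
```

The sort package documents sort.Sort as making O(n*log(n)) calls to Less, so most pairs are never compared directly. The order of those pairs is inferred from other comparisons, which is only safe when Less is transitive.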