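I have written an SSH client to connect to network devices, and I use select to set a timeout once a command runs longer than 25 seconds. I have noticed that on some devices, which run a different IOS, the SSH session cannot be torn down via the Close() method once the timeout fires, and this causes a goroutine leak. I need to keep the client and disconnect the session to be ready for the next command. It looks like the goroutine never terminates at that point! Do you have any ideas?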
go func() {
    r <- s.Run(cmd)
}()
select {
case err := <-r:
    return err
case <-time.After(time.Duration(timeout) * time.Second):
    s.Close()
    return fmt.Errorf("timeout after %d seconds", timeout)
}
Heap profiling shows the following:
2.77GB 99.44% 99.44% 2.77GB 99.44% bytes.makeSlice
0 0% 99.44% 2.77GB 99.44% bytes.(*Buffer).ReadFrom
0 0% 99.44% 2.77GB 99.44% golang.org/x/crypto/ssh.(*Session).start.func1
0 0% 99.44% 2.77GB 99.44% golang.org/x/crypto/ssh.(*Session).stdout.func1
0 0% 99.44% 2.77GB 99.44% io.Copy
0 0% 99.44% 2.77GB 99.44% io.copyBuffer
0 0% 99.44% 2.78GB 99.93% runtime.goexit
ROUTINE ======================== runtime.goexit in /usr/local/go/src/runtime/asm_amd64.s
0 2.78GB (flat, cum) 99.93% of Total
. . 1993: RET
. . 1994:
. . 1995:// The top-most function running on a goroutine
. . 1996:// returns to goexit+PCQuantum.
. . 1997:TEXT runtime·goexit(SB),NOSPLIT,$0-0
. 2.78GB 1998: BYTE $0x90 // NOP
. . 1999: CALL runtime·goexit1(SB) // does not return
. . 2000: // traceback from goexit1 must hit code range of goexit
. . 2001: BYTE $0x90 // NOP
. . 2002:
. . 2003:TEXT runtime·prefetcht0(SB),NOSPLIT,$0-8
Answer 0 (score: 0)
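The channel r blocks the goroutine from returning because it is never drained: once the timeout branch is taken, nobody receives from r, so the send never completes. I have written an adapted version of your code and added a WaitGroup to demonstrate the problem: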
func main() {
    var wg sync.WaitGroup // This is only added for demonstration purposes
    s := new(clientSession)
    r := make(chan error)
    wg.Add(1) // Register the goroutine before starting it to avoid a race with wg.Wait()
    go func(s *clientSession) {
        r <- s.Run()
        wg.Done() // Only reached once s.Run() has returned and the send on r has completed
    }(s)
    fmt.Println("Client has been opened")
    select {
    case err := <-r:
        fmt.Println(err)
    case <-time.After(1 * time.Second):
        s.Close()
        fmt.Println("Timed out, closing")
    }
    wg.Wait() // Waits until wg.Done() is called.
    fmt.Println("Main finished successfully")
}
The Go Playground seems to kill the program, so I created a gist with the complete runnable code. When we run incorrect.go:
$ go run incorrect.go
Client has been opened
Timed out, closing
fatal error: all goroutines are asleep - deadlock!
....
That is because our code deadlocks on the wg.Wait() line. This shows that wg.Done() in the goroutine is never reached.
As pointed out in the comments, a buffered channel can help here, but only if you no longer care about the error once s.Close() has been called:
r := make(chan error, 1)
buffered.go runs fine, but the error is lost:
$ go run buffered.go
Client has been opened
Timed out, closing
Main finished successfully
Another option is to drain the channel once after closing:
select {
case err := <-r:
    fmt.Println(err)
case <-time.After(1 * time.Second):
    s.Close()
    fmt.Println("Timed out, closing")
    fmt.Println(<-r)
}
Or wrap the select in a for loop (unbuffered channel):
X:
    for {
        select {
        case err := <-r:
            fmt.Println(err)
            break X // because we are in main(). Normally `return err`
        case <-time.After(1 * time.Second):
            s.Close()
            fmt.Println("Timed out, closing")
        }
    }
When running drain.go, we also see the error message:
$ go run drain.go
Client has been opened
Timed out, closing
Run() closed
Main finished successfully
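In the real world, multiple goroutines will probably be running. You will then need some kind of counter on the for loop, or make further use of the WaitGroup functionality.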