Is it possible for a goroutine to be interrupted without a panic?

Time: 2019-05-14 13:26:38

Tags: go

I am setting up a service that serves an HTTP server and also runs goroutines to process some jobs; see the code below.

In one cycle, one sub-job appears to have been interrupted: there are no logs after a particular function call.

It did not hit any panic, and the deferred function does not seem to have run, because the mutex was never unlocked.

The logs were not truncated or lost, and the logs of the other jobs are complete. There was no restart, exit, or OOM kill during that time.

This is on CentOS 7.5, and my service runs in Docker.

Go 1.11, Docker 18.09

This is an intermittent bug. I have added more logging and enabled pprof, and I am now trying to reproduce it.

main.go

func main() {
    ....
    // this is a cycle job, with custom time intervals
    router.Cycle(r)
    ....
    endless.ListenAndServe(":"+conf.Conf.Port, r)
}

router/cycle.go

// this is a loop job: when a job ends, it sleeps for a custom interval and runs again
// it is implemented by wrapping a goroutine and creating a context
func Cycle(g *gin.Engine) {
    cyclec := cli.InitCycle(g)
    cyclec.AddFunc(time.Second, schedule.RunSomeDeal)
    cyclec.Start()
}

// RunSomeDeal
func RunSomeDeal(c *gin.Context) error {
    ...
    // deal some sub job
    for i := 0; i < missionLen; i++ {
        // this is a one-off job: like the cycle job, but it runs only once
        // a new context is derived from the existing context, and a goroutine executes the callback
        helpers.Job.Run(c, func(newCtx *gin.Context) error {
            return DealMission(newCtx, someparams...)
        })
    }
    return nil
}

// Job.Run
func (c *Job) Run(ctx *gin.Context, f func(ctx *gin.Context) error) {
    e := &Entry{
        Job: FuncJob(f),
    }

    if c.getJobContext != nil {
        e.span = c.getJobContext(ctx)
    }
    go c.runWithRecovery(e)
}

func (c *Job) runWithRecovery(e *Entry) {
    ctx := gin.CreateNewContext(c.gin)
    ...
    defer func() {
        if r := recover(); r != nil {
            const size = 64 << 10
            buf := make([]byte, size)
            buf = buf[:runtime.Stack(buf, false)]

            requestId, _ := ctx.Get("requestId")
            handleName := ctx.CustomContext.HandlerName()
            info, _ := json.Marshal(map[string]interface{}{
                ...some kv for log
            })
            log.Printf(...)
        }
        gin.RecycleContext(c.gin, ctx)
    }()

    if c.beforeRun != nil {
        ok := c.beforeRun(ctx, e.span)
        if !ok {
            return
        }
    }

    error := e.Job.Run(ctx)
    ...
    if c.afterRun != nil {
        c.afterRun(ctx)
    }
}


// DealMission

func DealMission(c *gin.Context, params...) error {
    // lock something using sync.Mutex
    doSomeLock()

    defer func() {
        // ...this deferred function was not triggered
        unlockErr := unlockxxxxx(...)
        if unlockErr != nil {
            panic("some error info")
        }
    }()

    base.DebugLog(...)
    err := SomeOtherFunc(c, params...)
    base.DebugLog(...)
    return err
}

// some other func
func SomeOtherFunc(ctx *gin.Context, params...) error {
    err := CallOther()
    base.DebugLog(...)

    err = CallOther()
    base.DebugLog(...)

    // there are no logs after this call, and Job.runWithRecovery did not catch any panic
    err = CallOther()
    // print log...
    base.DebugLog(...)
    return err
}

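In this sub-job, the logs stop at a certain line. No panic occurred, no error was reported, and the deferred function does not seem to have run, because the mutex was never unlocked.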

The logs of the other jobs look fine, and so do the logs of the next cycle job.

0 answers:

No answers