I'm running an Ubuntu VM on Azure. Two days ago my server stopped working. I found this in my syslog:
Dec 11 06:45:28 myservice kernel: [4525694.437314] INFO: task nginx:22992 blocked for more than 120 seconds.
Dec 11 06:45:28 myservice kernel: [4525694.442895] Not tainted 3.16.0-29-generic #39-Ubuntu
Dec 11 06:45:28 myservice kernel: [4525694.447905] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 11 06:45:28 myservice kernel: [4525694.453525] nginx D ffff8801bb633840 0 22992 22990 0x00000000
Dec 11 06:45:28 myservice kernel: [4525694.453531] ffff8801a0a7bd60 0000000000000082 ffff8801a0ebf010 0000000000013840
Dec 11 06:45:28 myservice kernel: [4525694.453534] ffff8801a0a7bfd8 0000000000013840 ffff8801a0ebf010 ffff8801b88d8d10
Dec 11 06:45:28 myservice kernel: [4525694.453536] ffff8801b88d8d14 ffff8801a0ebf010 00000000ffffffff ffff8801b88d8d18
Dec 11 06:45:28 myservice kernel: [4525694.453539] Call Trace:
Dec 11 06:45:28 myservice kernel: [4525694.453547] [<ffffffff817858a9>] schedule_preempt_disabled+0x29/0x70
Dec 11 06:45:28 myservice kernel: [4525694.453551] [<ffffffff81787e45>] __mutex_lock_slowpath+0xd5/0x1f0
Dec 11 06:45:28 myservice kernel: [4525694.453562] [<ffffffff81787f7f>] mutex_lock+0x1f/0x30
Dec 11 06:45:28 myservice kernel: [4525694.453580] [<ffffffffc048fe90>] cifs_strict_writev+0xf0/0x250 [cifs]
Dec 11 06:45:28 myservice kernel: [4525694.453585] [<ffffffff811e0991>] new_sync_write+0x81/0xb0
Dec 11 06:45:28 myservice kernel: [4525694.453588] [<ffffffff811e1177>] vfs_write+0xb7/0x1f0
Dec 11 06:45:28 myservice kernel: [4525694.453592] [<ffffffff811ffdcb>] ? set_close_on_exec+0x4b/0x60
Dec 11 06:45:28 myservice kernel: [4525694.453595] [<ffffffff811e1d26>] SyS_write+0x46/0xb0
Dec 11 06:45:28 myservice kernel: [4525694.453598] [<ffffffff8178a1ad>] system_call_fastpath+0x1a/0x1f
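Googling suggests this may be related to a high disk I/O rate. But my Azure monitoring shows that disk read/write values were very low during the time frame in question. CPU and memory usage were low as well.
Another guess is a hardware failure. How can I check whether that is really the cause, and if it is, how do I fix it when my VM is in the cloud? Migrate to a new VM?!
I'm also running a very old nginx version that I want to update, but I don't think that is the cause of this problem, is it?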
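
In case it helps with narrowing this down, here is a minimal sketch of how the hung-task reports could be pulled out of the log and summarized. It assumes the log is at /var/log/syslog and that the lines look like the excerpt above; the function name `summarize_hung_tasks` and the regular expressions are my own, not part of any tool. It only lists which task was blocked and which kernel symbols and modules appear in its call trace (in the excerpt above, the only module named is `cifs`).

```python
import re

# Header line the kernel prints for each hung task, as in the excerpt above.
HUNG_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

# Call-trace frame: [<address>] [? ]symbol+0xoff/0xsize [optional module]
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?(?P<sym>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?"
)


def summarize_hung_tasks(lines):
    """Group call-trace frames under the hung-task header that precedes them."""
    reports = []
    current = None
    for line in lines:
        header = HUNG_RE.search(line)
        if header:
            current = {
                "task": header["name"],
                "pid": int(header["pid"]),
                "seconds": int(header["secs"]),
                "frames": [],
                "modules": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current["frames"].append(frame["sym"])
            # Record each kernel module only once, in order of appearance.
            if frame["mod"] and frame["mod"] not in current["modules"]:
                current["modules"].append(frame["mod"])
    return reports


if __name__ == "__main__":
    # Assumed log location; adjust if the system logs elsewhere.
    with open("/var/log/syslog", errors="replace") as log:
        for report in summarize_hung_tasks(log):
            print(report["task"], report["pid"], report["seconds"],
                  report["modules"], report["frames"])
```

This only reads the log and does not change anything on the system.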