Worried about an Invalid BSONObj error when running db.stats() on production MongoDB

Time: 2013-02-02 00:43:50

Tags: mongodb bson

When we run db.stats(), our production database (2.2.1 on 64-bit Debian) throws the following error:

> db.stats()
{
    "errmsg" : "exception: Invalid BSONObj size: 0 (0x00000000) first element: EOO",
    "code" : 10334,
    "ok" : 0
}

The following shows up in our logs:

Fri Feb  1 16:28:46 [conn4081] Assertion: 10334:Invalid BSONObj size: 0 (0x00000000) first element: EOO
0xaf8c41 0xabedb9 0xabef3c 0x571fb7 0x6e880d 0x6f6411 0x6e8321 0x6e9cb0 0x6eab4c 0x830028 0x83376b 0x7b0b0d 0x7b20e2 0x56fe42 0xae6ed1 0x7fe7645378ba 0x7fe7638eb02d
 /opt/mongodb/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xaf8c41]
 /opt/mongodb/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x99) [0xabedb9]
 /opt/mongodb/bin/mongod() [0xabef3c]
 /opt/mongodb/bin/mongod(_ZNK5mongo7BSONObj14_assertInvalidEv+0x497) [0x571fb7]
 /opt/mongodb/bin/mongod() [0x6e880d]
 /opt/mongodb/bin/mongod(_ZN5mongo7DBStats3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x321) [0x6f6411]
 /opt/mongodb/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRNS_14BSONObjBuilderEb+0x51) [0x6e8321]
 /opt/mongodb/bin/mongod(_ZN5mongo11execCommandEPNS_7CommandERNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xe70) [0x6e9cb0]
 /opt/mongodb/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x2ac) [0x6eab4c]
 /opt/mongodb/bin/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x38) [0x830028]
 /opt/mongodb/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0xc0b) [0x83376b]
 /opt/mongodb/bin/mongod() [0x7b0b0d]
 /opt/mongodb/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x3a2) [0x7b20e2]
 /opt/mongodb/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x82) [0x56fe42]
 /opt/mongodb/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x411) [0xae6ed1]
 /lib/libpthread.so.0(+0x68ba) [0x7fe7645378ba]
 /lib/libc.so.6(clone+0x6d) [0x7fe7638eb02d]

We are quite concerned about this. Any ideas? Everything I have found online is old and inconclusive.

Here is some more information:

> db.serverStatus()
{
    "host" : "hellboy",
    "version" : "2.2.1",
    "process" : "mongod",
    "pid" : 1432,
    "uptime" : 2672006,
    "uptimeMillis" : NumberLong("2672006216"),
    "uptimeEstimate" : 2626689,
    "localTime" : ISODate("2013-02-01T23:03:16.304Z"),
    "locks" : {
        "." : {
            "timeLockedMicros" : {
                "R" : NumberLong(333024517),
                "W" : NumberLong("54808066962")
            },
            "timeAcquiringMicros" : {
                "R" : NumberLong("30969573082"),
                "W" : NumberLong("4107434021")
            }
        },
        "admin" : {
            "timeLockedMicros" : {
                "r" : NumberLong(5942684),
                "w" : NumberLong(0)
            },
            "timeAcquiringMicros" : {
                "r" : NumberLong(48432),
                "w" : NumberLong(0)
            }
        },
        "local" : {
            "timeLockedMicros" : {
                "r" : NumberLong(1109128),
                "w" : NumberLong(0)
            },
            "timeAcquiringMicros" : {
                "r" : NumberLong(82283399),
                "w" : NumberLong(0)
            }
        },
        "gc" : {
            "timeLockedMicros" : {
                "r" : NumberLong("171460799918"),
                "w" : NumberLong("171384959016")
            },
            "timeAcquiringMicros" : {
                "r" : NumberLong("1816006512260"),
                "w" : NumberLong("3169374123999")
            }
        }
    },
    "globalLock" : {
        "totalTime" : NumberLong("2672006216000"),
        "lockTime" : NumberLong("54808066962"),
        "currentQueue" : {
            "total" : 0,
            "readers" : 0,
            "writers" : 0
        },
        "activeClients" : {
            "total" : 0,
            "readers" : 0,
            "writers" : 0
        }
    },
    "mem" : {
        "bits" : 64,
        "resident" : 4212,
        "virtual" : 443165,
        "supported" : true,
        "mapped" : 221237,
        "mappedWithJournal" : 442474
    },
    "connections" : {
        "current" : 364,
        "available" : 455
    },
    "extra_info" : {
        "note" : "fields vary by platform",
        "heap_usage_bytes" : 77840056,
        "page_faults" : 15189196
    },
    "indexCounters" : {
        "btree" : {
            "accesses" : 1490093267,
            "hits" : 1490093267,
            "misses" : 0,
            "resets" : 0,
            "missRatio" : 0
        }
    },
    "backgroundFlushing" : {
        "flushes" : 36144,
        "total_ms" : 614413596,
        "average_ms" : 16999.048140770254,
        "last_ms" : 352,
        "last_finished" : ISODate("2013-02-01T23:02:43.221Z")
    },
    "cursors" : {
        "totalOpen" : 5,
        "clientCursors_size" : 5,
        "timedOut" : 3,
        "totalNoTimeout" : 5
    },
    "network" : {
        "bytesIn" : 53731292608,
        "bytesOut" : NumberLong("2215346701908"),
        "numRequests" : 264535004
    },
    "opcounters" : {
        "insert" : 83515158,
        "query" : 141076950,
        "update" : 21415981,
        "delete" : 98,
        "getmore" : 685956,
        "command" : 18499441
    },
    "asserts" : {
        "regular" : 0,
        "warning" : 57,
        "msg" : 0,
        "user" : 0,
        "rollovers" : 0
    },
    "writeBacksQueued" : false,
    "dur" : {
        "commits" : 30,
        "journaledMB" : 0,
        "writeToDataFilesMB" : 0,
        "compression" : 0,
        "commitsInWriteLock" : 0,
        "earlyCommits" : 0,
        "timeMs" : {
            "dt" : 3074,
            "prepLogBuffer" : 0,
            "writeToJournal" : 0,
            "writeToDataFiles" : 0,
            "remapPrivateView" : 0
        }
    },
    "recordStats" : {
        "accessesNotInMemory" : 3070244,
        "pageFaultExceptionsThrown" : 1124345,
        "admin" : {
            "accessesNotInMemory" : 0,
            "pageFaultExceptionsThrown" : 0
        },
        "gc" : {
            "accessesNotInMemory" : 3070244,
            "pageFaultExceptionsThrown" : 1124345
        },
        "local" : {
            "accessesNotInMemory" : 0,
            "pageFaultExceptionsThrown" : 0
        }
    },
    "ok" : 1
}

3 answers:

Answer 0: (score: 5)

We had the same problem with our replica set. Running use local; db.repairDatabase() cleared it up.

Answer 1: (score: 4)

In the end,

mongodump --repair --dbpath /data/db /path/to/dump

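managed to produce a dump that we later used to recreate the database. The error went away. This meant some downtime, but now we can bring up our replica set without worrying about replicating a corrupt database.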

Answer 2: (score: 3)

Are you running as a replica set? If so, what happens when you run:

use local
db.repairDatabase()

If you get a similar Invalid BSONObj error, you probably have a corrupted oplog.

If that is the case, you will need to rebuild it:

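1) For every other node in the replica set:
    - Stop the node
    - Delete the "local" directory
2) On the node you want as primary:
    - Delete the "local" directory
    - Start it
    - Run rs.initiate()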
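
For context on why the message points towards corruption: a BSON document starts with a little-endian int32 holding its total size and ends with a 0x00 terminator, so the smallest valid document is 5 bytes. A size of 0 with a first element of EOO (type byte 0) is what a reader would see if it landed on a zeroed region of a data file. The sketch below is only an illustration of that check in plain Python; check_bson_header is a hypothetical helper, not mongod's actual validation code.

```python
import struct

# 4-byte length header plus the 1-byte 0x00 terminator (an empty document).
MIN_BSON_SIZE = 5


def check_bson_header(buf: bytes) -> str:
    """Hypothetical helper: sanity-check the framing of one BSON document."""
    if len(buf) < 4:
        return "truncated: fewer than 4 bytes"
    # The first four bytes are the document's total size, little-endian int32.
    (size,) = struct.unpack_from("<i", buf, 0)
    if size < MIN_BSON_SIZE:
        return f"invalid size: {size} (0x{size & 0xFFFFFFFF:08x})"
    if size > len(buf):
        return f"truncated: header says {size} bytes, buffer has {len(buf)}"
    # The last byte of every document is the EOO terminator.
    if buf[size - 1] != 0:
        return "missing terminator"
    return "ok"


if __name__ == "__main__":
    empty_doc = b"\x05\x00\x00\x00\x00"  # {}
    zeroed_region = bytes(16)            # what a zero-filled record looks like
    print(check_bson_header(empty_doc))
    print(check_bson_header(zeroed_region))
```

This checks framing only, not the elements inside the document, and it says nothing about which database or collection holds the bad record.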