How to programmatically pre-split a GUID-based shard key in MongoDB

Date: 2013-10-30 01:35:42

Tags: mongodb sharding

Let's say I am using a fairly standard 32-character hex GUID. Because it is randomly generated for my users, I have determined that it is well suited as a shard key for horizontally scaling writes to the MongoDB collection in which I will be storing user information (and write scaling is my primary concern).

I also know that I will need to start with at least 4 shards, based on traffic projections and some benchmarking done in a test environment.

Finally, I have a decent idea of my initial data size (average document size * number of initial users) - roughly 120GB.

I would like to make the initial load nice and fast and utilize all 4 shards as much as possible. How do I pre-split this data so that I take advantage of all 4 shards and minimize the number of moves, splits, etc. that need to happen on the shards during the initial data load?

1 Answer:

Answer 0 (score: 24)

We know the initial data size (120GB) and we know that the default maximum chunk size in MongoDB is 64MB. If we divide 120GB by 64MB we get 1920 - so that is the minimum number of chunks we should start with. As it happens, 2048 is a power of 16 divided by 2, and given that the GUID (our shard key) is hex based, that is a much easier number to work with than 1920 (see below).
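
For illustration, here is the same arithmetic as a throwaway snippet you could paste into any JavaScript shell (the variable names are just for this sketch):

// back-of-the-envelope chunk math from the figures above
var dataSizeMB   = 120 * 1024;   // 120GB expressed in MB
var chunkSizeMB  = 64;           // default max chunk size
var minChunks    = Math.ceil(dataSizeMB / chunkSizeMB);   // 1920
var chosenChunks = Math.pow(16, 3) / 2;                   // 2048 - the nearest "hex friendly" count
print(minChunks + " chunks minimum, rounding up to " + chosenChunks);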

NOTE: This pre-splitting must be done before any data is added to the collection. If you shard a collection that already contains data (via the shardCollection() command), MongoDB will split the data itself, and you would then be running this while chunks already exist - that can lead to some pretty odd chunk distribution, so beware.

For the purposes of this answer, let's assume the database will be called users and the collection userInfo. Let's also assume the GUID will be written to the _id field. With those parameters, we would connect to a mongos and run the following commands:

// first switch to the users DB
use users;
// now enable sharding for the users DB
sh.enableSharding("users"); 
// enable sharding on the relevant collection
sh.shardCollection("users.userInfo", {"_id" : 1});
// finally, disable the balancer (see below for options on a per-collection basis)
// this prevents migrations from kicking off and interfering with the splits by competing for meta data locks
sh.stopBalancer(); 
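
If you want to confirm the balancer really is stopped before carrying on, the standard sh.getBalancerState() helper gives a quick check:

// optional sanity check - should print false now that the balancer is stopped
sh.getBalancerState();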

Now, per the calculation above, we need to split the GUID range into 2048 chunks. To do that we need at least 3 hex digits (16^3 = 4096), and we will be placing them in the most significant digits (i.e. the 3 leftmost) of the ranges. Again, this should be run from the mongos shell:
// Simply use a for loop for each digit
for ( var x=0; x < 16; x++ ){
    for( var y=0; y<16; y++ ) {
        // for the innermost loop we will increment by 2 to get 2048 total iterations
        // make this z++ for 4096 - that would give ~30MB chunks based on the original figures
        for ( var z=0; z<16; z+=2 ) {
            // now construct the GUID with zeroes for padding - handily the toString method takes an argument to specify the base
            var prefix = "" + x.toString(16) + y.toString(16) + z.toString(16) + "00000000000000000000000000000";
            // finally, use the split command to create the appropriate chunk
            db.adminCommand( { split : "users.userInfo" , middle : { _id : prefix } } );
        }
    }
}
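
As a quick sanity check, you can also count the resulting chunks directly in the config metadata (a sketch that assumes the pre-5.0 config.chunks schema, where chunk documents carry an ns field):

// count the chunks created for the collection straight from the config database
// 2048 split points yield 2049 chunks once the initial min/max chunk is included
db.getSiblingDB("config").chunks.count({ ns : "users.userInfo" });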

Once that is complete, let's check the state of play using the sh.status() helper:

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "version" : 3,
        "minCompatibleVersion" : 3,
        "currentVersion" : 4,
        "clusterId" : ObjectId("527056b8f6985e1bcce4c4cb")
}
  shards:
        {  "_id" : "shard0000",  "host" : "localhost:30000" }
        {  "_id" : "shard0001",  "host" : "localhost:30001" }
        {  "_id" : "shard0002",  "host" : "localhost:30002" }
        {  "_id" : "shard0003",  "host" : "localhost:30003" }
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
        {  "_id" : "users",  "partitioned" : true,  "primary" : "shard0001" }
                users.userInfo
                        shard key: { "_id" : 1 }
                        chunks:
                                shard0001       2049
                        too many chunks to print, use verbose if you want to force print

We have our 2048 chunks (plus one extra thanks to the min/max chunk), but they are all still on the original shard because the balancer is off. So, let's re-enable the balancer:

sh.startBalancer();

That will immediately start to balance out, and it will be relatively quick because all the chunks are empty, but it will still take a little while (much slower if it is competing with migrations from other collections). Once some time has elapsed, run sh.status() again and there you (should) have it - all 2048 chunks split out nicely across the 4 shards and ready for the initial data load:

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "version" : 3,
        "minCompatibleVersion" : 3,
        "currentVersion" : 4,
        "clusterId" : ObjectId("527056b8f6985e1bcce4c4cb")
}
  shards:
        {  "_id" : "shard0000",  "host" : "localhost:30000" }
        {  "_id" : "shard0001",  "host" : "localhost:30001" }
        {  "_id" : "shard0002",  "host" : "localhost:30002" }
        {  "_id" : "shard0003",  "host" : "localhost:30003" }
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
        {  "_id" : "users",  "partitioned" : true,  "primary" : "shard0001" }
                users.userInfo
                        shard key: { "_id" : 1 }
                        chunks:
                                shard0000       512
                                shard0002       512
                                shard0003       512
                                shard0001       513
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "test",  "partitioned" : false,  "primary" : "shard0002" }

You are now ready to start loading data, but to absolutely guarantee that no splits or migrations happen until your data load is complete, you need to do one more thing - turn off the balancer and auto-splitting for the duration of the import:

  • To disable all balancing, run this command from the mongos: sh.stopBalancer()
  • If you want to leave other balancing operations running, you can instead disable balancing for just the relevant collection. Using the namespace above as an example: sh.disableBalancing("users.userInfo") (see the sketch after this list)
  • To turn off auto-splitting for the duration of the load, you will need to restart each mongos with the --noAutoSplit option.
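
Collected together, the per-collection variant looks something like this sketch (the mongos restarts happen outside the shell, so they appear only as comments):

// before the import: stop balancing for just this collection
sh.disableBalancing("users.userInfo");
// (restart each mongos with --noAutoSplit, then run the data load)
// after the import: return to the defaults
sh.enableBalancing("users.userInfo");
// (restart each mongos without --noAutoSplit)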

Once the import is complete, reverse the steps as needed (sh.startBalancer(), sh.enableBalancing("users.userInfo"), and restart the mongos without --noAutoSplit) to return everything to the default settings.

**Update: Optimizing for Speed**

The approach above is fine if you are not in a hurry, but as you will discover if you test it, the balancer is not very fast - even with empty chunks. Hence, the more chunks you create, the longer balancing will take. I have seen it take more than 30 minutes to finish balancing 2048 chunks, though this will vary depending on the deployment.

That might be OK for testing, or for a relatively quiet cluster, but on a busy cluster it will be much harder to ensure the balancer stays off and that no other updates interfere. So, how do we speed things up?

The answer is to do some manual moves early, then split the chunks once they are on their respective shards. Note that this is only desirable with certain shard keys (like a randomly distributed UUID) or certain data access patterns, so be careful that you do not end up with poor data distribution as a result.

Using the example above, we have 4 shards, so rather than doing all the splits and then balancing, we split into 4 chunks first. We then put one chunk on each shard by moving them manually, and finally we split those chunks into the required number.

The ranges in the example above would look like this:

$min --> "40000000000000000000000000000000"
"40000000000000000000000000000000" --> "80000000000000000000000000000000"
"80000000000000000000000000000000" --> "c0000000000000000000000000000000"
"c0000000000000000000000000000000" --> $max     

It only takes 4 commands to create these splits, but since we already have the loop above, why not reuse it in a simplified/modified form:

for ( var x=4; x < 16; x+=4){
    var prefix = "" + x.toString(16) + "0000000000000000000000000000000";
    db.adminCommand( { split : "users.userInfo" , middle : { _id : prefix } } ); 
} 

Here is how things look now - we have our 4 chunks, all on shard0001:

mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "version" : 4,
    "minCompatibleVersion" : 4,
    "currentVersion" : 5,
    "clusterId" : ObjectId("53467e59aea36af7b82a75c1")
}
  shards:
    {  "_id" : "shard0000",  "host" : "localhost:30000" }
    {  "_id" : "shard0001",  "host" : "localhost:30001" }
    {  "_id" : "shard0002",  "host" : "localhost:30002" }
    {  "_id" : "shard0003",  "host" : "localhost:30003" }
  databases:
    {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
    {  "_id" : "test",  "partitioned" : false,  "primary" : "shard0001" }
    {  "_id" : "users",  "partitioned" : true,  "primary" : "shard0001" }
        users.userInfo
            shard key: { "_id" : 1 }
            chunks:
                shard0001   4
            { "_id" : { "$minKey" : 1 } } -->> { "_id" : "40000000000000000000000000000000" } on : shard0001 Timestamp(1, 1) 
            { "_id" : "40000000000000000000000000000000" } -->> { "_id" : "80000000000000000000000000000000" } on : shard0001 Timestamp(1, 3) 
            { "_id" : "80000000000000000000000000000000" } -->> { "_id" : "c0000000000000000000000000000000" } on : shard0001 Timestamp(1, 5) 
            { "_id" : "c0000000000000000000000000000000" } -->> { "_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 6)                    

We will leave the $min chunk where it is and move the other three. You can do this programmatically, but it depends on where the chunks initially reside, how you have named your shards, etc., so I will leave it manual for now; it is not too onerous - just 3 moveChunk commands:

mongos> sh.moveChunk("users.userInfo", {"_id" : "40000000000000000000000000000000"}, "shard0000")
{ "millis" : 1091, "ok" : 1 }
mongos> sh.moveChunk("users.userInfo", {"_id" : "80000000000000000000000000000000"}, "shard0002")
{ "millis" : 1078, "ok" : 1 }
mongos> sh.moveChunk("users.userInfo", {"_id" : "c0000000000000000000000000000000"}, "shard0003")
{ "millis" : 1083, "ok" : 1 }          

Let's double check and make sure the chunks are where we expect them to be:

mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "version" : 4,
    "minCompatibleVersion" : 4,
    "currentVersion" : 5,
    "clusterId" : ObjectId("53467e59aea36af7b82a75c1")
}
  shards:
    {  "_id" : "shard0000",  "host" : "localhost:30000" }
    {  "_id" : "shard0001",  "host" : "localhost:30001" }
    {  "_id" : "shard0002",  "host" : "localhost:30002" }
    {  "_id" : "shard0003",  "host" : "localhost:30003" }
  databases:
    {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
    {  "_id" : "test",  "partitioned" : false,  "primary" : "shard0001" }
    {  "_id" : "users",  "partitioned" : true,  "primary" : "shard0001" }
        users.userInfo
            shard key: { "_id" : 1 }
            chunks:
                shard0001   1
                shard0000   1
                shard0002   1
                shard0003   1
            { "_id" : { "$minKey" : 1 } } -->> { "_id" : "40000000000000000000000000000000" } on : shard0001 Timestamp(4, 1) 
            { "_id" : "40000000000000000000000000000000" } -->> { "_id" : "80000000000000000000000000000000" } on : shard0000 Timestamp(2, 0) 
            { "_id" : "80000000000000000000000000000000" } -->> { "_id" : "c0000000000000000000000000000000" } on : shard0002 Timestamp(3, 0) 
            { "_id" : "c0000000000000000000000000000000" } -->> { "_id" : { "$maxKey" : 1 } } on : shard0003 Timestamp(4, 0)  

That matches the ranges we proposed above, so all looks good. Now run the original loop above to split the chunks "in place" on each shard, and we should have a balanced distribution as soon as the loop finishes. One more sh.status() should confirm it:

mongos> for ( var x=0; x < 16; x++ ){
...   for( var y=0; y<16; y++ ) {
...   // for the innermost loop we will increment by 2 to get 2048 total iterations
...   // make this z++ for 4096 - that would give ~30MB chunks based on the original figures
...     for ( var z=0; z<16; z+=2 ) {
...     // now construct the GUID with zeroes for padding - handily the toString method takes an argument to specify the base
...         var prefix = "" + x.toString(16) + y.toString(16) + z.toString(16) + "00000000000000000000000000000";
...         // finally, use the split command to create the appropriate chunk
...         db.adminCommand( { split : "users.userInfo" , middle : { _id : prefix } } );
...     }
...   }
... }          
{ "ok" : 1 }
mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "version" : 4,
    "minCompatibleVersion" : 4,
    "currentVersion" : 5,
    "clusterId" : ObjectId("53467e59aea36af7b82a75c1")
}
  shards:
    {  "_id" : "shard0000",  "host" : "localhost:30000" }
    {  "_id" : "shard0001",  "host" : "localhost:30001" }
    {  "_id" : "shard0002",  "host" : "localhost:30002" }
    {  "_id" : "shard0003",  "host" : "localhost:30003" }
  databases:
    {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
    {  "_id" : "test",  "partitioned" : false,  "primary" : "shard0001" }
    {  "_id" : "users",  "partitioned" : true,  "primary" : "shard0001" }
        users.userInfo
            shard key: { "_id" : 1 }
            chunks:
                shard0001   513
                shard0000   512
                shard0002   512
                shard0003   512
            too many chunks to print, use verbose if you want to force print    

And there you have it - no waiting for the balancer, the distribution is already even.