MongoDB Map & Reduce much slower than MySQL GROUP BY

Posted: 2012-07-17 11:20:40

Tags: mysql mongodb mapreduce

I am trying to evaluate which database system to use for a new project.

At the moment I am comparing MySQL and MongoDB for the task at hand.

I have 5 million records with 500 numeric fields, and I have to serve this data for plotting graphs at different levels of granularity.

I pumped the data into both MongoDB and MySQL. On the MySQL side I generated a set of temporary tables at 1/10th, 1/100th, and 1/1000th granularity. The application then picks the table that best matches the current task and queries the data there (a sketch of that selection step follows the INSERT statement below).

With this technique I can get the data back fast enough (< 100 ms). The SQL query I use is:

SELECT FROM_UNIXTIME(CAST(FLOOR(MIN(STAMP / 1000)) AS SIGNED INTEGER)),
       MIN(RING), MIN(STATE),
       CAST(FLOOR(MIN(STAMP)) AS SIGNED INTEGER),
       AVG(w21030401)
FROM project1
GROUP BY FLOOR((STAMP - 1181589892000) / 60000);

I use the same query to create the temporary tables; the only difference is that there are 350 wXXXXXX fields.

INSERT INTO project1_10 (TTIME, RING, STATE, STAMP, w21030401, .........)
SELECT FROM_UNIXTIME(CAST(FLOOR(MIN(STAMP / 1000)) AS SIGNED INTEGER)),
       MIN(RING), MIN(STATE),
       CAST(FLOOR(MIN(STAMP)) AS SIGNED INTEGER),
       AVG(w21030401), .......
FROM project1
GROUP BY FLOOR((STAMP - 1181589892000) / 60000);
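
For illustration, here is a minimal sketch of the application-side table selection described above. The function name pickTable, the bucket sizes, and the point budget are my own assumptions and are not part of the original setup:

// Hypothetical sketch: pick the pre-aggregated table whose bucket size keeps
// the number of rows for the requested time range below a plotting budget.
var tables = [                                    // assumed bucket sizes, for illustration only
    { name: 'project1_10',   bucketMs:   600000 },
    { name: 'project1_100',  bucketMs:  6000000 },
    { name: 'project1_1000', bucketMs: 60000000 }
];

function pickTable(rangeMs, maxPoints) {
    for (var i = 0; i < tables.length; i++) {
        if (rangeMs / tables[i].bucketMs <= maxPoints) {
            return tables[i].name;                // finest table that still fits the budget
        }
    }
    return tables[tables.length - 1].name;        // otherwise fall back to the coarsest table
}

// Example: pickTable(7 * 24 * 3600 * 1000, 2000) returns the table to use for one week of data.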

Then I tried to do the same thing with MongoDB. I pumped all the data into MongoDB and ended up with about 4.8 million documents of the form:

{ "_id" : ObjectId("50040b3f0cf2872a8d3af90d"), "TTIME" : 
ISODate("2008-11-30T06:40:07Z"), "STAMP" : NumberLong("1228027207000"), 
"STATE" : 2531, "RING" : 1, "w13010096" : 34.991, "w13010097" : 1.432, 
"w23010001" : 292, "w18030180" : 84, "w18030380" : 95, "w21030002" : 51.113, 
"w21030005" : 60.321, "w21030004" : 274.662, "w21030008" : 149.629, 
"w21030009" : 126.565, "w21030010" : 576.296, ........... }

Then I tried to generate the temporary documents with the following mapReduce:

keylist =  [ 'w21030401', 'w13011114', ....  ];

m = function (){
    var result = {};
    result['STAMP'] = this['STAMP'];
    result['RING'] = this['RING'];
    result['TTIME'] = this['TTIME'];
    result['STATE'] = this['STATE'];
    for(var key in keylist){
        if(key in this) {
            result[key] = this[key];
            result['cnt_' + key] = 1;
        }
    }
    var zone = Math.floor((this['STAMP'] - 1171004118000) / 1000000);
    emit( zone , result );
};
r = function (name, values){
    var result = {};
    result['STAMP'] = values[0]['STAMP'];
    result['RING'] = values[0]['RING'];
    result['TTIME'] = values[0]['TTIME'];
    result['STATE'] = values[0]['STATE'];
    for(var key in keylist) {
        result[key] = 0;
        result['cnt_' + key] = 0;
    }
    for ( var i=0; i<values.length; i++ ) {
        if(values[i]['STAMP'] < result['STAMP']) {
            result['STAMP'] = values[i]['STAMP'];
            result['TTIME'] = values[i]['TTIME'];
        }
        if(values[i]['RING'] < result['RING']) {
            result['RING'] = values[i]['RING'];
        }
        if(values[i]['STATE'] < result['STATE']) {
            result['STATE'] = values[i]['STATE'];
        }
        for(var key in keylist) {
            if(key in values[i]) {
                result[key] += values[i][key];
                result['cnt_' + key] += values[i]['cnt_' + key];
            }
        }
    }
    return result;
};
f = function(who, val){
    var result = {};
    result['STAMP'] = val['STAMP'];
    result['RING'] = val['RING'];
    result['TTIME'] = val['TTIME'];
    result['STATE'] = val['STATE'];
    for(var key in keylist) {
        if(key in val) {
            result[key] = val[key]/val['cnt_'+key];
        }
    }
    return result;
};



db.project1.mapReduce( m, r, { finalize : f, scope: { keylist: keylist }, out : {replace : 'project1_100'} , jsMode : false });
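
For context, the mapReduce output lands in the project1_100 collection with the usual { _id, value } document shape, where _id is the zone key emitted by the map function. Below is a minimal sketch of how the application could then query a time range, reusing the zone formula from the map function above; the range bounds are illustrative (the start is simply the STAMP from the example document shown earlier):

// Convert a millisecond time range into zone numbers with the same formula used in the map
// function, then fetch the pre-aggregated documents in order for plotting.
var fromMs = 1228027207000;                 // illustrative start: the STAMP from the example document
var toMs   = fromMs + 24 * 3600 * 1000;     // illustrative end: one day later
var fromZone = Math.floor((fromMs - 1171004118000) / 1000000);
var toZone   = Math.floor((toMs   - 1171004118000) / 1000000);

db.project1_100.find({ _id: { $gte: fromZone, $lte: toZone } })
               .sort({ _id: 1 })
               .forEach(function (doc) {
                   // each document has the shape { _id: <zone>, value: { ... } },
                   // where value is whatever the finalize function returned
                   printjson(doc.value);
               });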

MySQL took 210 seconds to create the temporary table; MongoDB took about 4 hours.

My questions are: Is MongoDB unsuitable for my problem? Do I need bigger hardware for MongoDB than for MySQL, or am I doing something wrong in my mapReduce?

Thanks,

Peter

0 Answers:

No answers yet.