SQL to MapReduce: how do I write a min subquery in the where clause?

Time: 2013-03-31 13:54:35

Tags: mongodb mapreduce

This is my collection:

{
    "_id" : ObjectId("5156c3722653306612b4a4a0"),
    "ps_availqty" : 3325,
    "ps_supplycost" : 771.64,
    "ps_comment" : "requests after the carefully ironic ideas cajole alongside of the enticingly special accounts. fluffily regular deposits haggle about the blithely ironic deposits. regular requests sleep c",
    "ps_partkey" : {
        "p_partkey" : NumberLong(1),
        "p_name" : "goldenrod lace spring peru powder",
        "p_mfgr" : "Manufacturer#1",
        "p_brand" : "Brand#13",
        "p_type" : "PROMO BURNISHED COPPER",
        "p_size" : 7,
        "p_container" : "JUMBO PKG",
        "p_retailprice" : 901,
        "p_comment" : "final deposits s"
    },
    "ps_suppkey" : {
        "s_suppkey" : NumberLong(2),
        "s_name" : "Supplier#000000002",
        "s_address" : "89eJ5ksX3ImxJQBvxObC,",
        "s_phone" : "15-679-861-2259",
        "s_acctbal" : 4032.68,
        "s_comment" : "furiously stealthy frays thrash alongside of the slyly express deposits. blithely regular req",
        "s_nationkey" : {
            "n_nationkey" : NumberLong(5),
            "n_name" : "ETHIOPIA",
            "n_comment" : "fluffily ruthless requests integrate fluffily. pending ideas wake blithely acco",
            "n_regioin" : {
                "r_regionkey" : NumberLong(0),
                "r_name" : "AFRICA",
                "r_comment" : "special Tiresias about the furiously even dolphins are furi"
            }
        }
    }
}

The SQL query is:

select
    s_acctbal, 
    s_name, 
    n_name, 
    p_partkey, 
    p_mfgr, 
    s_address, 
    s_phone, 
    s_comment
from 
    part, 
    supplier, 
    partsupp, 
    nation, 
    region
where 
    p_partkey = ps_partkey
    and s_suppkey = ps_suppkey
    and p_size = 15
    and p_type like '%BRASS'
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = 'EUROPE'
    and ps_supplycost = (
        select 
            min(ps_supplycost)
        from 
            partsupp, supplier, 
            nation, region
        where 
            p_partkey = ps_partkey
            and s_suppkey = ps_suppkey
            and s_nationkey = n_nationkey
            and n_regionkey = r_regionkey
            and r_name = 'EUROPE'
    )
order by 
    s_acctbal desc, 
    n_name, 
    s_name, 
    p_partkey;

I have done the first (and very simple) step. The MapReduce:

db.runCommand({
    mapreduce: "partsupp",
    query: {
        "ps_partkey.p_size": 15,
        "ps_partkey.p_type": {'$regex': /BRASS$/},
        "ps_suppkey.s_nationkey.n_regioin.r_name": "EUROPE",
    },
    map: function() {
        emit(
            {
                s_acctbal:  this.ps_suppkey.s_acctbal, 
                s_name:     this.ps_suppkey.s_name, 
                n_name:     this.ps_suppkey.s_nationkey.n_name, 
                p_partkey:  this.ps_partkey.p_partkey, 
                p_mfgr:     this.ps_partkey.p_mfgr,
                s_address:  this.ps_suppkey.s_address, 
                s_phone:    this.ps_suppkey.s_phone, 
                s_comment:  this.ps_suppkey.s_comment
            }, 
            {
                ps_supplycost: this.ps_supplycost
            }
        );
    },
    reduce: function(key, values) {

    },
    out: 'query002'
});

But the problem is how to compute min(ps_supplycost). What does the min function mean in this query?

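1 Answer:

Answer 0 (score: 0)

You can add two specific fields to the mapReduce command: sort and limit.

In an ordinary query, we can get the minimum value like this: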

db.collection.find().sort({field:1}).limit(1);
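
As a rough illustration of why sorting ascending and keeping the first result yields the minimum, here is a plain JavaScript sketch that runs outside MongoDB. The `docs` array and the `minBySortLimit` helper are hypothetical stand-ins for the collection and for the `find().sort().limit(1)` chain; they are not MongoDB APIs, and the cost values are made up.

```javascript
// Hypothetical in-memory stand-in for a collection; the values are made up.
const docs = [
  { ps_supplycost: 771.64 },
  { ps_supplycost: 120.5 },
  { ps_supplycost: 993.49 },
];

// Mimics find().sort({field: 1}).limit(1): sort ascending, keep the first.
function minBySortLimit(items, field) {
  return items
    .slice()                              // copy so the input is not reordered
    .sort((a, b) => a[field] - b[field])  // ascending, like sort({field: 1})
    .slice(0, 1);                         // like limit(1)
}

const cheapest = minBySortLimit(docs, "ps_supplycost");
console.log(cheapest[0].ps_supplycost); // prints 120.5
```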

Just add the sort and limit fields:

... 
query: {
    "ps_partkey.p_size": 15,
    "ps_partkey.p_type": {'$regex': /BRASS$/},
    "ps_suppkey.s_nationkey.n_regioin.r_name": "EUROPE",
},
sort: {ps_supplycost:1},
limit: 1,
map: function() { ...

That's all. I should add that an index can be very useful here. Use .explain() to check whether it is used. Give it a try :)
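
One caveat, offered as a hedged sketch rather than a definitive answer: in mapReduce, sort and limit are applied to the input documents before map runs, so limit: 1 passes only the single cheapest matching document to map. The subquery in the question is correlated on p_partkey, which in the usual reading asks for the minimum cost per part. Under that assumption, an alternative is to emit p_partkey as the key and let reduce keep the cheapest value. The names `mapFn`, `reduceFn`, and `runMapReduce` below are hypothetical; `runMapReduce` is a tiny in-memory driver so the sketch runs in plain JavaScript, and the sample documents are made up.

```javascript
// Map: key by part, carry the cost plus a supplier field to report.
// `emit` is passed in here; inside MongoDB it is a global and `doc` is `this`.
function mapFn(doc, emit) {
  emit(doc.ps_partkey.p_partkey, {
    ps_supplycost: doc.ps_supplycost,
    s_name: doc.ps_suppkey.s_name,
  });
}

// Reduce: return the value with the lowest cost. The output has the same
// shape as the emitted values, so re-reducing partial results is safe.
function reduceFn(key, values) {
  let best = values[0];
  for (const v of values) {
    if (v.ps_supplycost < best.ps_supplycost) best = v;
  }
  return best;
}

// Hypothetical in-memory driver, for illustration only.
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  for (const doc of docs) {
    map(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  const out = {};
  for (const [key, values] of groups) {
    // Like MongoDB, skip reduce when a key has only one value.
    out[key] = values.length === 1 ? values[0] : reduce(key, values);
  }
  return out;
}

// Made-up sample data in the shape of the question's collection.
const sample = [
  { ps_supplycost: 771.64, ps_partkey: { p_partkey: 1 }, ps_suppkey: { s_name: "Supplier#000000002" } },
  { ps_supplycost: 337.09, ps_partkey: { p_partkey: 1 }, ps_suppkey: { s_name: "Supplier#000000005" } },
  { ps_supplycost: 378.49, ps_partkey: { p_partkey: 2 }, ps_suppkey: { s_name: "Supplier#000000003" } },
];

const result = runMapReduce(sample, mapFn, reduceFn);
console.log(result);
```

Note that this sketch keeps a single supplier per part; if several suppliers tie on the minimum cost, the SQL query would return all of them, so the reduce step would need to collect ties instead.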