Integrating Pig and MongoDB: how do I load nested documents from MongoDB in Pig?

Time: 2018-08-06 05:07:21

Tags: mongodb apache-pig

I am trying to load a MongoDB collection in Pig using MongoLoader. The documents in my MongoDB collection have the following structure:

{
    "_id" : ObjectId("5b66f9612b8cc21fc5c3dcc2"),
    "address" : {
            "building" : "759",
            "coord" : [
                    -73.9925306,
                    40.7309346
            ],
            "street" : "Broadway",
            "zipcode" : "10003"
    },
    "borough" : "Manhattan",
    "cuisine" : "Delicatessen",
    "grades" : [
            {
                    "date" : ISODate("2014-01-21T00:00:00Z"),
                    "grade" : "A",
                    "score" : 12
            },
            {
                    "date" : ISODate("2013-01-04T00:00:00Z"),
                    "grade" : "A",
                    "score" : 11
            },
            {
                    "date" : ISODate("2012-06-07T00:00:00Z"),
                    "grade" : "A",
                    "score" : 6
            },
            {
                    "date" : ISODate("2012-01-17T00:00:00Z"),
                    "grade" : "A",
                    "score" : 8
            }
    ],
    "name" : "Bully'S Deli",
    "restaurant_id" : "40361708"

}

I am trying to access the grades field, which is an array of documents. To do that, I run the following commands in Pig:

grunt> register mongo-hadoop-core-1.5.2.jar;
grunt> register mongo-hadoop-pig-1.5.2.jar;
grunt> register mongo-java-driver-2.13.2.jar;
grunt> restaurants = load 'mongodb://127.0.0.1:27017/restaurants_DB.restaurants'
           using com.mongodb.hadoop.pig.MongoLoader('borough:chararray, grades:map[], name:chararray');

I declared grades as a map, but that does not work: the field is not read from MongoDB, and I get the following warning:

2018-08-06 04:48:27,235 [LocalJobRunner Map Task Executor #0] WARN  com.mongodb.hadoop.pig.BSONLoader - Type MAP for field grades can not be applied to class com.mongodb.BasicDBList

Can anyone suggest how to read the grades field correctly?
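
The warning says that grades arrives as a com.mongodb.BasicDBList, i.e. a BSON array, so a map[] declaration cannot hold it. One possible direction, untested here and assuming that MongoLoader in mongo-hadoop 1.5.2 accepts a standard Pig bag-of-tuples schema string for an array field, is to declare grades as a bag of tuples. The names t and grades_flat are illustrative only, and the date subfield is left out because how BSON dates map to Pig types in this version is not confirmed:

```pig
-- Sketch only: assumes MongoLoader maps a BSON array of documents to a Pig bag of tuples.
REGISTER mongo-hadoop-core-1.5.2.jar;
REGISTER mongo-hadoop-pig-1.5.2.jar;
REGISTER mongo-java-driver-2.13.2.jar;

-- grades is declared as a bag whose tuples carry the subdocument fields of interest.
restaurants = LOAD 'mongodb://127.0.0.1:27017/restaurants_DB.restaurants'
    USING com.mongodb.hadoop.pig.MongoLoader(
        'borough:chararray, grades:bag{t:tuple(grade:chararray, score:int)}, name:chararray');

-- FLATTEN turns each restaurant row into one row per grade entry.
grades_flat = FOREACH restaurants
    GENERATE name, borough, FLATTEN(grades) AS (grade:chararray, score:int);

DUMP grades_flat;
```

If the loader accepts this schema, each restaurant should yield one output row per element of its grades array.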

0 answers:

No answers