Read/extract the $date field value from a JSON array

Time: 2015-12-01 07:33:58

Tags: apache-pig

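I have a file in which each line is a JSON object (it is actually a Stack Overflow dump). I can load it into Apache Pig without trouble, but I cannot read/extract "$date" from LastActivityDate. Can you help me work out how to get the $date value from "LastActivityDate"?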

Here is an example record:

{ 
"_id" : { "$oid" : "506492073401d91fa7fdffbe" }, 
"Body" : "....", 
"ViewCount" : 7351, 
"LastEditorDisplayName" : "Rich B", 
"Title" : ".....", 
"LastEditorUserId" : 140328, 
"LastActivityDate" : { "$date" : 1314819738077 }, 
"LastEditDate" : { "$date" : 1313882544213 }, 
"AnswerCount" : 12, "CommentCount" : 19, 
"AcceptedAnswerId" : 7, 
"Score" : 83, 
"PostTypeId" : "question", 
"OwnerUserId" : 8, 
"Tags" : [ "c#", "winforms" ], 
"CreationDate" : { "$date" : 1217540572667 }, 
"FavoriteCount" : 13, "Id" : 4, 
"ForumName" : "stackoverflow.com" 
}

I get the following error:

grunt> entitleGen1 = FOREACH entitleGen GENERATE id, ViewCount, LastActivityDate#'$date' as "LastActivityDate" :chararray;
2015-12-01 05:42:19,840 [main] ERROR org.apache.pig.impl.PigContext - Undefined parameter : date
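
The message "Undefined parameter : date" suggests that Pig's parameter substitution is reading $date inside the map key as a reference to a parameter named date before the statement is parsed. A minimal sketch of a possible workaround follows, assuming that a backslash before the dollar sign suppresses substitution in this Pig version and that LastActivityDate was loaded as a map; neither assumption is confirmed by the question. The double-quoted alias is also replaced with a plain identifier, since to my knowledge Pig does not accept quoted aliases.

```pig
-- Sketch only: entitleGen, id and ViewCount are taken from the statement above.
-- Assumption: \$ stops the preprocessor from treating $date as a parameter.
-- Assumption: LastActivityDate is a map, so #'...' looks up the "$date" key.
entitleGen1 = FOREACH entitleGen GENERATE
    id,
    ViewCount,
    LastActivityDate#'\$date' AS LastActivityDate;
```

In the sample record the value is a number that looks like epoch milliseconds (1314819738077), so a cast to long may be a better fit than chararray if a type is needed; whether that cast succeeds depends on how the loader typed the map values.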

0 answers:

No answers