This is a continuation of an earlier question.
My new Pig script is:
register /usr/hdp/current/pig-client/lib/piggybank.jar
register /opt/elephantbird-jars/elephant-bird-core-4.5.jar
register /opt/elephantbird-jars/elephant-bird-hadoop-compat-4.5.jar
register /opt/elephantbird-jars/elephant-bird-pig-4.5.jar
register /opt/elephantbird-jars/json-simple-1.1.1.jar
data_input = LOAD 'local/path/for/hdfs/files' USING com.twitter.elephantbird.pig.load.JsonLoader() AS (json:map[]);
x = FOREACH data_input GENERATE json#'actor__id' AS actor_id, json#'actor__image__url' AS actor_image_url, json#'actor__displayName' AS actor_displayname, json#'actor__verification__adHocVerified' AS actor_verification, json#'actor__url' AS actor_url;
STORE x INTO '/tmp/user_posts' USING JsonStorage();
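This script runs in local mode: pig -x local user_posts.pig
But it fails in mapreduce mode: pig -x mapreduce user_posts.pig
I copied the jars to exactly the same location on every data node. I am not sure where to look next. Can anyone point me in the right direction?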
Answer 0 (score: 0)
You are missing the semicolon (;) at the end of every REGISTER statement:
REGISTER '/me/home/elephant-bird-core-4.12.jar';
REGISTER '/me/home/elephant-bird-pig-4.12.jar';
REGISTER '/me/home/elephant-bird-hadoop-compat-4.12.jar';
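Applied to the script in the question, the same fix might look like the sketch below. It assumes the jars stay at the paths and versions given in the question (4.5, not the 4.12 used above) and changes only the statement terminators. Whether this alone resolves the mapreduce failure is not confirmed here.

```pig
-- Same REGISTER lines as in the question, each terminated with a semicolon.
-- Paths and versions are copied from the question, not verified.
REGISTER /usr/hdp/current/pig-client/lib/piggybank.jar;
REGISTER /opt/elephantbird-jars/elephant-bird-core-4.5.jar;
REGISTER /opt/elephantbird-jars/elephant-bird-hadoop-compat-4.5.jar;
REGISTER /opt/elephantbird-jars/elephant-bird-pig-4.5.jar;
REGISTER /opt/elephantbird-jars/json-simple-1.1.1.jar;
```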
Answer 1 (score: 0)
This turned out to be a problem with my machine, not with Pig. I restarted the machine and everything worked.