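When I run my Pig script locally (pig -file script.pig -param INPUT=val -param OUTPUT=val), everything works. But when I schedule the same script with Oozie (coordinator/workflow), it fails, and I don't understand why.
Can anyone help?
Pig script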
alarms = LOAD '$INPUT' USING PigStorage('|', '-noschema') AS (
    row_num:long,
    timestamp:chararray,
    protocol_name:chararray,
    source_ip:chararray,
    destination_ip:chararray,
    source_port:int,
    destination_port:int
);
alarms_projection = FOREACH alarms {
    GENERATE
        SUBSTRING(timestamp, 0, 10) as alarm_date:chararray,
        SUBSTRING(timestamp, 11, 19) as alarm_time:chararray,
        protocol_name,
        source_ip,
        destination_ip,
        source_port,
        destination_port;
}
STORE alarms_projection INTO '$OUTPUT' USING PigStorage('|');
Error
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.PigMain], exception invoking main(), Scheme not present in uri /etl/av/complete/alarms
org.apache.oozie.action.hadoop.LauncherException: Scheme not present in uri /etl/av/complete/alarms
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:177)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.oozie.action.hadoop.LauncherException: Scheme not present in uri /etl/av/complete/alarms
at org.apache.oozie.action.hadoop.LauncherURIHandlerFactory.getURIHandler(LauncherURIHandlerFactory.java:41)
at org.apache.oozie.action.hadoop.PrepareActionsDriver.doOperations(PrepareActionsDriver.java:65)
at org.apache.oozie.action.hadoop.LauncherMapper.executePrepare(LauncherMapper.java:444)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:173)
... 8 more
Oozie Launcher failed, finishing Hadoop job gracefully
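Answer 0 (score: 0)
This was a configuration error in workflow.xml. I was using a prepare element to clear the output directory, but instead of giving the full URI hdfs://node:port/path/to/the/file, I had used a bare path, /path/to/the/. The stack trace shows the failure in PrepareActionsDriver, where LauncherURIHandlerFactory.getURIHandler needs the URI scheme to choose a handler, which is why the launcher reports "Scheme not present in uri".
The correct way to use prepare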
<prepare>
<delete path="hdfs://node:8020/path/to/files"/>
</prepare>
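For context, here is a minimal sketch of how the prepare element might sit inside a full Pig action. It is not the original workflow.xml: the action name, the ok/error targets, and the ${jobTracker}, ${nameNode} and ${input} properties are assumptions, with ${nameNode} presumed to be defined in job.properties as something like hdfs://node:8020. Building the path from ${nameNode} is one way to keep the scheme present without hard-coding the host.

```xml
<!-- Sketch only: names and properties below are hypothetical, not from the original post. -->
<action name="pig-alarms">
    <pig>
        <job-tracker>${jobTracker}</job-tracker>
        <!-- Assumed to resolve to a full URI such as hdfs://node:8020 -->
        <name-node>${nameNode}</name-node>
        <prepare>
            <!-- The path must carry a scheme; a bare /path/to/files fails in the launcher -->
            <delete path="${nameNode}/path/to/files"/>
        </prepare>
        <script>script.pig</script>
        <!-- These map to the -param INPUT=... and -param OUTPUT=... arguments used locally -->
        <param>INPUT=${input}</param>
        <param>OUTPUT=${nameNode}/path/to/files</param>
    </pig>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

With this layout, the directory removed by prepare and the directory written by STORE come from the same expression, so the two cannot drift apart.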