Error loading a file in Pig

Date: 2017-02-18 13:06:19

Tags: hadoop hadoop2

I am trying to run a Pig script from the terminal and I get the following error:

INFO  [Thread-13] org.apache.hadoop.util.NativeCodeLoader     - Loaded the native-hadoop library
WARN  [Thread-13]    org.apache.hadoop.mapred.JobClient     - No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
INFO  [Thread-13]    org.apache.hadoop.mapred.JobClient     - Cleaning up the staging area file:/tmp/hadoop-biadmin/mapred/staging/biadmin-341199244/.staging/job_local_0001
ERROR [Thread-13] org.apache.hadoop.security.UserGroupInformation     - PriviledgedActionException as:biadmin cause:org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: file:/home/biadmin/PIGData/books.csv
ERROR [main] org.apache.pig.tools.pigstats.SimplePigStats     - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: file:/home/biadmin/PIGData/books.csv
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:285)
        at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1024)
        at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1041)
        at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:959)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:912)
        at java.security.AccessController.doPrivileged(AccessController.java:310)
        at javax.security.auth.Subject.doAs(Subject.java:573)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:912)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:886)
        at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
        at java.lang.Thread.run(Thread.java:738)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/home/biadmin/PIGData/books.csv
        at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
        at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:273)
        ... 15 more

ERROR [main] org.apache.pig.tools.pigstats.PigStatsUtil     - 1 map reduce job(s) failed!
ERROR [main]      org.apache.pig.tools.grunt.Grunt     - ERROR 1066: Unable to open iterator for alias b
Details at logfile: /opt/ibm/biginsights/pig/bin/pig_1487413261020.log

Can anyone help me solve this problem?

Code:

data = LOAD '/home/biadmin/PIGData/books.csv';
b = FOREACH data GENERATE $0;
DUMP b;

1 Answer:

Answer 0: (score: 1)

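According to the exception above, the input file does not exist at the given path, file:/home/biadmin/PIGData/books.csv. The file: scheme means Pig looked for it on the local file system, not in HDFS.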

Pig has two execution modes:
1. Local mode (reads files from the local file system)
$ pig -x local
2. MapReduce mode (reads files from HDFS)
$ pig or $ pig -x mapreduce

Make sure you run the Pig script in the mode that matches where the input file actually lives.
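
To decide which mode applies, it can help to check where the input file really is. The following is a minimal sketch, not something taken from the question: `locate_input` is a hypothetical helper name, and it assumes the `hadoop` command is on the PATH when HDFS is to be checked. It prints `local`, `mapreduce`, or `missing` for a given path.

```shell
#!/bin/sh
# Sketch: report where an input file can be found, to help pick a Pig mode.
# locate_input is a hypothetical helper name, not part of Pig or Hadoop.
locate_input() {
    input_path="$1"
    if [ -e "$input_path" ]; then
        # Present on the local file system: readable with `pig -x local`.
        echo "local"
    elif command -v hadoop >/dev/null 2>&1 \
        && hadoop fs -test -e "$input_path" 2>/dev/null; then
        # Present in HDFS: readable with `pig -x mapreduce`.
        echo "mapreduce"
    else
        # Found in neither place: fix the path or copy the file there first.
        echo "missing"
    fi
}
```

For example, `locate_input /home/biadmin/PIGData/books.csv` printing `missing` would suggest that the path in the LOAD statement is wrong or that the file still has to be copied to that location, while `mapreduce` would suggest running the script with `pig -x mapreduce` instead of local mode.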