我写了猪脚本:
my_script.pig
bag_1 = LOAD '$INPUT' USING PigStorage('|') AS (LN_NR:chararray,ET_NR:chararray,ET_ST_DT:chararray,ED_DT:chararray,PI_ID:chararray);
bag_2 = LIMIT bag_1 $SIZE;
DUMP bag_2;
并将一个param文件设为:
my_param.txt:
INPUT = hdfs://0.0.0.0:8020/user/training/example
SIZE = 10
现在,我正在通过
调用脚本pig my_param.txt my_script.pig
此命令但是收到错误:
错误1000:解析期间出错。词汇错误
对此的任何建议
答案 0 :(得分:0)
我认为您需要使用 -m 或 -param_file 选项提供参数文件。请参阅下面的帮助文档。
$ pig --help
Apache Pig version 0.11.0-cdh4.7.1 (rexported)
compiled Nov 18 2014, 09:08:23
USAGE: Pig [options] [-] : Run interactively in grunt shell.
Pig [options] -e[xecute] cmd [cmd ...] : Run cmd(s).
Pig [options] [-f[ile]] file : Run cmds found in file.
options include:
-4, -log4jconf - Log4j configuration file, overrides log conf
-b, -brief - Brief logging (no timestamps)
-c, -check - Syntax check
-d, -debug - Debug level, INFO is default
-e, -execute - Commands to execute (within quotes)
-f, -file - Path to the script to execute
-g, -embedded - ScriptEngine classname or keyword for the ScriptEngine
-h, -help - Display this message. You can specify topic to get help for that topic.
properties is the only topic currently supported: -h properties.
-i, -version - Display version information
-l, -logfile - Path to client side log file; default is current working directory.
-m, -param_file - Path to the parameter file
-p, -param - Key value pair of the form param=val
-r, -dryrun - Produces script with substituted parameters. Script is not executed.
-t, -optimizer_off - Turn optimizations off. The following values are supported:
SplitFilter - Split filter conditions
PushUpFilter - Filter as early as possible
MergeFilter - Merge filter conditions
PushDownForeachFlatten - Join or explode as late as possible
LimitOptimizer - Limit as early as possible
ColumnMapKeyPrune - Remove unused data
AddForEach - Add ForEach to remove unneeded columns
MergeForEach - Merge adjacent ForEach
GroupByConstParallelSetter - Force parallel 1 for "group all" statement
All - Disable all optimizations
All optimizations listed here are enabled by default. Optimization values are case insensitive.
-v, -verbose - Print all error messages to screen
-w, -warning - Turn warning logging on; also turns warning aggregation off
-x, -exectype - Set execution mode: local|mapreduce, default is mapreduce.
-F, -stop_on_failure - Aborts execution on the first failed job; default is off
-M, -no_multiquery - Turn multiquery optimization off; default is on
-P, -propertyFile - Path to property file
$
答案 1 :(得分:0)
您没有正确使用该命令。
要使用属性文件,请在命令中使用 -param_file :
pig -param_file <file> pig_script.pig
中查看更多详情