Unable to run a Pig script with Azure PowerShell

Time: 2015-04-14 23:41:17

Tags: powershell azure hadoop apache-pig hdinsight

Here is my Pig script:

$QueryString = "A =  load 'wasb://$containername@$StorageAccount.blob.core.windows.net/table1' using PigStorage(',') as (col1 chararray,col2 chararray,col3 chararray,col4 chararray,col5 chararray,col6 chararray,col7 int,col8 int);" +
"user_list = foreach A GENERATE $0;" +
"unique_user = DISTINCT user_list;" +
"unique_users_group = GROUP unique_user ALL;" +
"uu_count = FOREACH unique_users_group GENERATE COUNT(unique_user);" +
"DUMP uu_count;"

When I run the Pig script above, I get this error:
2015-04-14 23:17:55,177 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. <line 1, column 166>  mismatched input 'chararray' expecting RIGHT_PAREN
Failed to parse: <line 1, column 166>  mismatched input 'chararray' expecting RIGHT_PAREN
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:241)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:179)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:769)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:509)
at org.apache.pig.Main.main(Main.java:156)
2015-04-14 23:17:55,177 [main] ERROR org.apache.pig.tools.grunt.Grunt -   ERROR 1200: <line 1, column 166>  mismatched input 'chararray' expecting RIGHT_PAREN

I then changed the LOAD statement as follows; the rest of the script is unchanged:

$QueryString = "A =  load 'wasb://$containername@$StorageAccount.blob.core.windows.net/table1';" +

The error I get now is:

2015-04-14 23:23:00,117 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. <line 1, column 162>  Syntax error, unexpected symbol at or near ';'
Failed to parse: <line 1, column 162>  Syntax error, unexpected symbol at or near ';'
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:241)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:179)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:769)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:509)
at org.apache.pig.Main.main(Main.java:156)
2015-04-14 23:23:00,132 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 1, column 162>  Syntax error, unexpected symbol at or near ';'
Details at logfile: C:\apps\dist\hadoop-2.4.0.2.1.9.0-2196\logs\pig_1429053777602.log

I don't understand what the error is. Can you help me run this query from Windows PowerShell? (I'm using Windows PowerShell ISE, so I can edit the query.)


1 answer:

Answer 0 (score: 1)

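The problem is in the statement user_list = foreach A GENERATE $0;. Inside a double-quoted string, PowerShell treats $0 as a variable reference, and because no such variable is defined, it substitutes an empty string. Pig therefore receives GENERATE ; and reports a syntax error near ';'. You can either define the variable in your script, e.g. $0 = '$0';, or simply escape the $, like this: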

user_list = foreach A GENERATE `$0;

PowerShell uses ` (the backtick, next to the "1" key) as the escape character in double-quoted strings.
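
The difference between the three quoting styles can be seen in a small sketch. The variable names $unescaped, $escaped and $single are made up for illustration, and the snippet assumes that no variable named 0 is defined in the session (the first line removes one if it exists):

```powershell
# Assumption: make sure no variable named 0 exists in this session.
Remove-Variable -Name 0 -ErrorAction SilentlyContinue

# Double quotes: $0 is expanded, and an undefined variable expands to nothing.
$unescaped = "user_list = foreach A GENERATE $0;"

# Double quotes with a backtick: the $ is escaped, so $0 stays literal.
$escaped = "user_list = foreach A GENERATE `$0;"

# Single quotes: no variable expansion happens at all.
$single = 'user_list = foreach A GENERATE $0;'

Write-Output $unescaped
Write-Output $escaped
Write-Output $single
```

Printing the query string like this before submitting the job is a quick way to check what Pig will actually receive.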

So the script looks like this, with $0 defined first:

$0 = '$0';
$QueryString = "A =  load 'wasb://$containerName@$storageAccountName.blob.core.windows.net/table1' using PigStorage(',') as (col1,col2,col3,col4,col5,col6,col7,col8) ;"+
"user_list = foreach A GENERATE $0;" +
"unique_user = DISTINCT user_list;" +
"unique_users_group = GROUP unique_user ALL;" +
"uu_count = FOREACH unique_users_group GENERATE COUNT(unique_user);" +
"DUMP uu_count;"

Or, with the $ escaped by a backtick instead:

$QueryString = "A =  load 'wasb://$containerName@$storageAccountName.blob.core.windows.net/table1' using PigStorage(',') as (col1,col2,col3,col4,col5,col6,col7,col8) ;"+
"user_list = foreach A GENERATE `$0;" +
"unique_user = DISTINCT user_list;" +
"unique_users_group = GROUP unique_user ALL;" +
"uu_count = FOREACH unique_users_group GENERATE COUNT(unique_user);" +
"DUMP uu_count;"