ERROR 1128: Cannot find field dryTemp

Time: 2014-10-21 23:35:35

Tags: hadoop apache-pig hadoop-plugins

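I am running a Pig script over temperature data and I get an error. I have put the code and the error below to make the problem easier to understand.

The error is reported at line 38, column 15. I tried removing dryTemp, but that produced a different error.

Code: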

 --Load files into relations
    month1 = LOAD 'hdfs:/data/big/data/weather/weather/201201hourly.txt' USING PigStorage(',');
    month2 = LOAD 'hdfs:/data/big/data/weather/weather/201202hourly.txt' USING PigStorage(',');
    month3 = LOAD 'hdfs:/data/big/data/weather/weather/201203hourly.txt' USING PigStorage(',');
    month4 = LOAD 'hdfs:/data/big/data/weather/weather/201204hourly.txt' USING PigStorage(',');
    month5 = LOAD 'hdfs:/data/big/data/weather/weather/201205hourly.txt' USING PigStorage(',');
    month6 = LOAD 'hdfs:/data/big/data/weather/weather/201206hourly.txt' USING PigStorage(',');

    --Combine relations
    months = UNION month1, month2, month3, month4, month5, month6;

    /* Splitting relations
    SPLIT months INTO 
            splitMonth1 IF SUBSTRING(date, 4, 6) == '01',
            splitMonth2 IF SUBSTRING(date, 4, 6) == '02',
            splitMonth3 IF SUBSTRING(date, 4, 6) == '03',
            splitRest IF (SUBSTRING(date, 4, 6) == '04' OR SUBSTRING(date, 4, 6) == '04');
    */

    /*  Joining relations

    stations = LOAD 'hdfs:/data/big/data/QCLCD201211/stations.txt' USING PigStorage() AS (id:int, name:chararray)

    JOIN months BY wban, stations by id;

    */

    --filter out unwanted data
    clearWeather = FILTER months BY skyCondition == 'CLR';

    --Transform and shape relation
    shapedWeather = FOREACH clearWeather GENERATE date, SUBSTRING(date, 0, 4) as year, SUBSTRING(date, 4, 6) as month, SUBSTRING(date, 6, 8) as day, skyCondition, dryTemp;

    --Group relation specifying number of reducers
    groupedByMonthDay = GROUP shapedWeather BY (month, day) PARALLEL 10;

    --Aggregate relation
    aggedResults = FOREACH groupedByMonthDay GENERATE group as MonthDay, AVG(shapedWeather.dryTemp), MIN(shapedWeather.dryTemp), MAX(shapedWeather.dryTemp), COUNT(shapedWeather.dryTemp) PARALLEL 10;

    --Sort relation
    sortedResults = ORDER aggedResults BY $1 DESC;

    --Store results in HDFS
    STORE sortedResults INTO 'hdfs:/data/big/data/weather/pigresults' USING PigStorage(':');

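The error is below; it is rather long. I am still new to Pig and still learning. I believe the error is related to an unrecognized field type, but I don't know how to fix it. I hope you can help.

Error: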

ERROR 1128: Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1691)
    at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
    at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:607)
    at org.apache.pig.Main.main(Main.java:156)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: Failed to parse: Pig script failed to parse: 
<file Documentos/pig/weather.pig, line 38, column 15> pig script failed to validate: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1128: Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
    ... 15 more
Caused by: 
<file Documentos/pig/weather.pig, line 38, column 15> pig script failed to validate: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1128: Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
    at org.apache.pig.parser.LogicalPlanBuilder.buildForeachOp(LogicalPlanBuilder.java:1017)
    at org.apache.pig.parser.LogicalPlanGenerator.foreach_clause(LogicalPlanGenerator.java:15870)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1933)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
    ... 16 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1128: Cannot find field dryTemp in :bytearray,year:chararray,month:chararray,day:chararray,:bytearray,:bytearray
    at org.apache.pig.newplan.logical.expression.DereferenceExpression.translateAliasToPos(DereferenceExpression.java:215)
    at org.apache.pig.newplan.logical.expression.DereferenceExpression.getFieldSchema(DereferenceExpression.java:149)
    at org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:264)
    at org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:148)
    at org.apache.pig.newplan.logical.expression.DereferenceExpression.accept(DereferenceExpression.java:84)
    at org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
    at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visitAll(SchemaResetter.java:67)
    at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:122)
    at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:245)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:114)
    at org.apache.pig.parser.LogicalPlanBuilder.buildForeachOp(LogicalPlanBuilder.java:1015)
    ... 22 more

Here are a few lines from the 201211hourly.txt file:

WBAN,Date,Time,StationType,SkyCondition,SkyConditionFlag,Visibility,VisibilityFlag,WeatherType,WeatherTypeFlag,DryBulbFarenheit,DryBulbFarenheitFlag,DryBulbCelsius,DryBulbCelsiusFlag,WetBulbFarenheit,WetBulbFarenheitFlag,WetBulbCelsius,WetBulbCelsiusFlag,DewPointFarenheit,DewPointFarenheitFlag,DewPointCelsius,DewPointCelsiusFlag,RelativeHumidity,RelativeHumidityFlag,WindSpeed,WindSpeedFlag,WindDirection,WindDirectionFlag,ValueForWindCharacter,ValueForWindCharacterFlag,StationPressure,StationPressureFlag,PressureTendency,PressureTendencyFlag,PressureChange,PressureChangeFlag,SeaLevelPressure,SeaLevelPressureFlag,RecordType,RecordTypeFlag,HourlyPrecip,HourlyPrecipFlag,Altimeter,AltimeterFlag
03011,20120101,0015,0,CLR ,, 10.00 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,, M ,, AA ,,,, 30.43,
03011,20120101,0035,0,CLR ,, 10.00 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,, M ,, AA ,,,, 30.43,
03011,20120101,0055,0,CLR ,, 10.00 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,, M ,, AA ,,,, 30.44,
03011,20120101,0115,0,CLR ,, 10.00 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,, M ,, AA ,,,, 30.44,
03011,20120101,0135,0,CLR ,, 10.00 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,, M ,, AA ,,,, 30.45,
03011,20120101,0155,0,CLR ,, 10.00 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,, M ,, AA ,,,, 30.45,
03011,20120101,0215,0,CLR ,, 10.00 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,, M ,, AA ,,,, 30.46,
03011,20120101,0235,0,CLR ,, 10.00 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,, M ,, AA ,,,, 30.47,
03011,20120101,0255,0,CLR ,, 10.00 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,, M ,, AA ,,,, 30.48,
03011,20120101,0315,0,CLR ,, 10.00 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,, M ,, AA ,,,, 30.47,
03011,20120101,0335,0,CLR ,, 10.00 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,, M ,, AA ,,,, 30.47,
03011,20120101,0355,0,CLR ,, 10.00 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,, M ,, AA ,,,, 30.46,
03011,20120101,0415,0,CLR ,, 10.00 ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,, M ,, AA ,,,,, 30.46,

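2 Answers:

Answer 0 (score: 0)

It looks like you are loading 'month1', 'month2', etc. without specifying a schema. Without an AS clause the loaded fields have no names, so Pig cannot resolve 'dryTemp'; the schema should name it. You can try something like this: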

month1 = LOAD 'hdfs:/data/big/data/weather/201201hourly.txt' USING PigStorage(',') 
         AS (wban,year_month_day,time,station_type,maint_indic,
            sky_cond,visibility,weather_type,dryTemp);

Do the same for all the other months.

Thanks
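
As a rough sketch of the same idea: names in an AS clause are assigned by position, so they have to line up with the header row shown in the question, where SkyCondition is the 5th column and DryBulbFarenheit is the 11th. The names skyCondition and dryTemp below are simply the ones the question's script already uses, and the types are assumptions. DESCRIBE prints the schema Pig attached to the relation, which is a quick way to confirm that dryTemp is now a known field.

```pig
-- Name the first 11 columns by position; the names match the question's script.
-- The types are assumptions. Columns after the 11th are not declared here.
month1 = LOAD 'hdfs:/data/big/data/weather/weather/201201hourly.txt' USING PigStorage(',')
         AS (wban, date:chararray, time, stationType, skyCondition:chararray,
             skyConditionFlag, visibility, visibilityFlag, weatherType,
             weatherTypeFlag, dryTemp:int);

-- Print the schema to check that dryTemp is now a named field.
DESCRIBE month1;
```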

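Answer 1 (score: 0)

I made a few changes to your script:
1. Loaded the data with a proper schema (you can change the data type of each field as needed)
2. Combined the 6 LOAD statements into a single LOAD
3. Removed the commented-out code

I tested the Pig script below with your input and it works fine; the output is pasted below as well.

PigScript: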

--Load all the files into relations
 months = LOAD 'hdfs:/data/big/data/weather/weather/20120[1-6]hourly.txt' USING PigStorage(',') AS (WBAN:int,Date:chararray,Time:chararray,StationType:int,SkyCondition:chararray,SkyConditionFlag,Visibility,VisibilityFlag,WeatherType,WeatherTypeFlag,DryBulbFarenheit:int,DryBulbFarenheitFlag,DryBulbCelsius:double,DryBulbCelsiusFlag,WetBulbFarenheit:int,WetBulbFarenheitFlag,WetBulbCelsius:double,WetBulbCelsiusFlag,DewPointFarenheit,DewPointFarenheitFlag,DewPointCelsius,DewPointCelsiusFlag,RelativeHumidity,RelativeHumidityFlag,WindSpeed,WindSpeedFlag,WindDirection,WindDirectionFlag,ValueForWindCharacter,ValueForWindCharacterFlag,StationPressure,StationPressureFlag,PressureTendency,PressureTendencyFlag,PressureChange,PressureChangeFlag,SeaLevelPressure,SeaLevelPressureFlag,RecordType,RecordTypeFlag,HourlyPrecip,HourlyPrecipFlag,Altimeter,AltimeterFlag);

--filter out unwanted data
    clearWeather = FILTER months BY SkyCondition == 'CLR';

--Transform and shape relation
    shapedWeather = FOREACH clearWeather GENERATE Date,
                           SUBSTRING(Date,0,4) AS year,
                           SUBSTRING(Date,4,6) AS month,
                           SUBSTRING(Date,6,8) AS day,
                           SkyCondition,
                           DryBulbFarenheit AS dryTemp;

--Group relation specifying number of reducers
    groupedByMonthDay = GROUP shapedWeather BY (month, day) PARALLEL 10;

--Aggregate relation
    aggedResults = FOREACH groupedByMonthDay GENERATE group as MonthDay, AVG(shapedWeather.dryTemp), MIN(shapedWeather.dryTemp), MAX(shapedWeather.dryTemp), COUNT(shapedWeather.dryTemp) PARALLEL 10;

--Sort relation
    sortedResults = ORDER aggedResults BY $1 DESC;

--Store results in HDFS
    STORE sortedResults INTO 'hdfs:/data/big/data/weather/pigresults' USING PigStorage(':');

Output (based on your input sample above):

   (01,01):21.615384615384617:21:23:13

 MonthDay:(01,01)
 Avg:21.615384615384617
 Min:21
 Max:23
 Count:13
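
The four values after the group key are unnamed in the script, which is why the sort has to refer to the average as $1. A small variation, sketched here with hypothetical names (avgTemp, minTemp, maxTemp, readings), names them in the FOREACH so that ORDER BY can use a name instead of a position. It reuses groupedByMonthDay and shapedWeather from the script above and would replace the aggregate and sort statements.

```pig
-- Same aggregates as in the script above, with each output column named.
-- avgTemp, minTemp, maxTemp and readings are hypothetical names.
aggedResults = FOREACH groupedByMonthDay GENERATE
                   group AS MonthDay,
                   AVG(shapedWeather.dryTemp)   AS avgTemp,
                   MIN(shapedWeather.dryTemp)   AS minTemp,
                   MAX(shapedWeather.dryTemp)   AS maxTemp,
                   COUNT(shapedWeather.dryTemp) AS readings;

-- Sort by the named average rather than by position $1.
sortedResults = ORDER aggedResults BY avgTemp DESC;
```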