Question

I have two input files.

Student file:

abc 30 4.5
xyz 34 9.5
def 28 6.5
klm 35 10.5

Location file:

abc hawthorne
xyz artesia
def garnet
klm vanness

The output I want:

abc hawthorne
xyz artesia
def garnet
klm vanness

To achieve this, I wrote the following Pig script.

A = LOAD '/user/hive/warehouse/students.txt' USING PigStorage(' ') AS (NAME:CHARARRAY,AGE:INT,GPA:FLOAT);
B = LOAD '/user/hive/warehouse/location.txt.txt' using PigStorage(' ') AS (NAME:CHARARRAY,LOCATION:CHARARRAY);
C = JOIN A BY NAME , B BY LOCATION USING 'replicated';
DUMP C;

The problem is that I don't see any output. On top of that, I see the following warnings during execution:

2014-01-22 15:18:15,829 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  - Encountered Warning ACCESSING_NON_EXISTENT_FIELD 2 time(s).
2014-01-22 15:18:15,829 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  - Encountered Warning ACCESSING_NON_EXISTENT_FIELD 2 time(s).
2014-01-22 15:18:15,829 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  - Success!
2014-01-22 15:18:15,829 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  - Success!
2014-01-22 15:18:15,832 [main] INFO  org.apache.pig.data.SchemaTupleBackend  - Key [pig.schematuple] was not set... will not generate code.
2014-01-22 15:18:15,832 [main] INFO  org.apache.pig.data.SchemaTupleBackend  - Key [pig.schematuple] was not set... will not generate code.
2014-01-22 15:18:15,841 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat  - Total input paths to process : 1
2014-01-22 15:18:15,841 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil  - Total input paths to process : 1
2014-01-22 15:18:15,841 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil  - Total input paths to process : 1
Hadoop Job IDs executed by Pig: job_201401210934_0082,job_201401210934_0083

Answer 1

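I think you don't see any output because the join doesn't produce any matches. You are joining NAME from A (abc, xyz, def, klm) with LOCATION from B (hawthorne, artesia, garnet, vanness). None of the strings in the first set appear in the second, so the join returns no rows. Both relations have a NAME field, and that is the key the two files share.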
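A minimal sketch of what the corrected script might look like, assuming the paths and schemas from the question are right and that both files really are space-delimited. The only substantive change is the join key on B; the final FOREACH and the relation name D are my own additions to trim the result down to the two columns asked for.

```pig
-- Load both files with the same schemas as in the question.
A = LOAD '/user/hive/warehouse/students.txt' USING PigStorage(' ') AS (NAME:CHARARRAY, AGE:INT, GPA:FLOAT);
B = LOAD '/user/hive/warehouse/location.txt.txt' USING PigStorage(' ') AS (NAME:CHARARRAY, LOCATION:CHARARRAY);

-- Join on the field both relations share (NAME), not on LOCATION.
-- With 'replicated', the right-hand relation (B) is loaded into memory,
-- so it should be the small one.
C = JOIN A BY NAME, B BY NAME USING 'replicated';

-- D is an illustrative name: keep only the name and the location.
-- After a join, fields are disambiguated with the relation prefix.
D = FOREACH C GENERATE A::NAME AS NAME, B::LOCATION AS LOCATION;

DUMP D;
```

DUMP prints each row as a tuple, and the row order is not guaranteed to match the input files.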

Pig - Replicated Join
