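I have two files that share one common field. Based on that field's value, I need to pick up the matching values from the second file.
How do I add a WHERE condition here?
Is there any other Pipe that can be used for a NOT IN?
File 1: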
tcno,date,amt
1234,3/10/2016,1000
1234,3/11/2016,400
23456,2/10/2016,1500
File 2:
cno,fname,lname,city,phone,mail
1234,first,last,city,1234556,123@123.com
Sample code:
Pipe pipe1 = new Pipe("custPipe");
Pipe pipe2 = new Pipe("tscnPipe");
Fields cJoinField = new Fields("cno");
Fields tJoinField = new Fields("tcno");
Pipe pipe = new HashJoin(pipe1, cJoinField, pipe2, tJoinField, new OuterJoin());
//HOW TO ADD WHERE CONDITION i.e. CNO IS NULL FROM SECOND FILE
Fields outFields = new Fields("tcno","tdate", "tamt");
I want the output to be the last row of the first file: [23456,2/10/2016,1500]
Answer 0 (score: 3)
Going by the comment in your code:
//HOW TO ADD WHERE CONDITION i.e. CNO IS NULL FROM SECOND FILE
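Try using FilterNull. Note that FilterNull removes the tuples whose argument value is null; if you want to keep only the rows where cno is null (the NOT IN case), the inverse filter FilterNotNull from the same cascading.operation.filter package would be applied the same way.
Add this after the HashJoin step: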
FilterNull filterNull = new FilterNull();
pipe = new Each( pipe, cJoinField, filterNull );
Something like:
Pipe pipe1 = new Pipe("custPipe");
Pipe pipe2 = new Pipe("tscnPipe");
Fields cJoinField = new Fields("cno");
Fields tJoinField = new Fields("tcno");
Pipe pipe = new HashJoin(pipe1, cJoinField, pipe2, tJoinField, new OuterJoin());
// Filter out the tuples in which cno is null
FilterNull filterNull = new FilterNull();
pipe = new Each( pipe, cJoinField, filterNull );
Fields outFields = new Fields("tcno","tdate", "tamt");
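To make the expected result concrete, here is a minimal sketch of the NOT IN logic in plain Java, without Cascading, run against the sample rows from the question. The class name AntiJoinSketch and the method unmatchedTransactions are hypothetical names made up for this illustration, and the sketch assumes the join key is the first comma-separated column of each row, as in both sample files. It is not a replacement for the pipe assembly above, just a way to check what the filtered join should produce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Cascading.
public class AntiJoinSketch {

    // Returns the transaction rows whose first column (tcno) does not
    // appear as the first column (cno) of any customer row.
    static List<String> unmatchedTransactions(List<String> transactions,
                                              List<String> customers) {
        // Collect every known customer number.
        Set<String> knownCustomers = new HashSet<>();
        for (String row : customers) {
            knownCustomers.add(row.split(",")[0]);
        }

        // Keep only the transactions that have no matching customer.
        List<String> result = new ArrayList<>();
        for (String row : transactions) {
            String tcno = row.split(",")[0];
            if (!knownCustomers.contains(tcno)) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data copied from the question (header lines omitted).
        List<String> transactions = Arrays.asList(
            "1234,3/10/2016,1000",
            "1234,3/11/2016,400",
            "23456,2/10/2016,1500");
        List<String> customers = Arrays.asList(
            "1234,first,last,city,1234556,123@123.com");

        for (String row : unmatchedTransactions(transactions, customers)) {
            System.out.println(row);
        }
    }
}
```

With the sample data, only the 23456 row is kept, which is the row the question asks for. In the Cascading version, the same rows are the ones that come out of the outer join with a null cno.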