Suppose you are iterating through a bag using the foreach construct. Is it possible, for each element of the bag except the first, to iterate through all the elements that precede it?
To make this easier to visualize, consider a foreach loop that has reached the indicated position in the following bag:
(1, Element1)
(2, Element2)
(3, Element3)
(4, Element4)
(5, Element5) <-- You are here
(6, Element6)
(7, Element7)
At this point in the loop, I would like to run another loop through elements 1 to 4. The outer loop would then proceed to element 6, with the inner loop going through elements 1 to 5, and so on.
I highly doubt that this can be achieved with a built-in construct. Could it be achieved by implementing a UDF that takes the current index and the bag as arguments, does the desired processing, and returns the results to Pig?
Answer 0 (score: 2)
I'm afraid you can't do this in 'vanilla' Pig. You will have to write a custom UDF; here is a starting point:
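import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;

public class MyUDF extends EvalFunc<Long> {
    public Long exec(Tuple input) throws IOException {
        // The first argument is expected to be the bag
        Object values = input.get(0);
        if (values instanceof DataBag) {
            // your logic goes here
        }
        return ...
    }
}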
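As a rough illustration of the logic that could go inside such a UDF, here is a minimal sketch in plain Java. It uses a List in place of a DataBag so that it runs without Pig on the classpath; the class name PrefixIteration and the method prefixes are hypothetical and not part of any Pig API. Keep in mind that a Pig bag is an unordered collection, so a real UDF would probably need to sort the tuples by their index before doing anything like this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig: shows the "iterate over all
// previous elements" pattern on an ordinary, already ordered list.
final class PrefixIteration {

    // For each element at position i >= 1, collect the elements at
    // positions 0 .. i-1 (everything before the current element).
    static List<List<String>> prefixes(List<String> elements) {
        List<List<String>> result = new ArrayList<>();
        // Outer loop: skip the first element, which has no predecessors
        for (int i = 1; i < elements.size(); i++) {
            List<String> previous = new ArrayList<>();
            // Inner loop: walk over everything before position i
            for (int j = 0; j < i; j++) {
                previous.add(elements.get(j));
            }
            result.add(previous);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> bag = List.of("Element1", "Element2", "Element3");
        // prints [[Element1], [Element1, Element2]]
        System.out.println(prefixes(bag));
    }
}
```

In an actual UDF, the inner loop body is where the per-element processing would happen, and the nested lists would be replaced by whatever result the UDF is meant to return to Pig.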