Pig UDF PigGeolocationUDF does not work

Time: 2014-07-30 10:52:28

Tags: hadoop geolocation apache-pig user-defined-functions

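There are similar questions on StackOverflow, but none of them helped with my specific problem.

I copied some code from "Hadoop in Practice" and added a method (getCacheFiles) to write PigGeolocationUDF. But when I package the code into a JAR and run it in the Grunt shell (MapReduce mode), it does not work.

Can anyone help me? Where is my mistake?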

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.schema.Schema;
import com.maxmind.geoip.LookupService;


public class PigGeolocationUDF extends EvalFunc<String> {
    private LookupService geoloc;
    private static final String COUNTRY = "country";


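    // Ask Pig to ship the database through the distributed cache. The part
    // before '#' is a path on the cluster file system (HDFS in MapReduce mode);
    // the part after '#' is the symlink name in the task's working directory.
    @Override
    public List<String> getCacheFiles() {
        List<String> list = new ArrayList<String>(1);
        list.add("/tmp/GeoLiteCity.dat#GeoLiteCity.dat");
        return list;
    }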

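    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0) {
            return null;
        }

        Object object = input.get(0);
        if (object == null) {
            return null;
        }

        // The field may arrive as a bytearray when the load schema declares no
        // type, so convert with toString() instead of casting to String.
        String ip = object.toString();

        return lookup(ip);
    }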

    protected String lookup(String ip) throws IOException {
        if (geoloc == null) {
            // "GeoLiteCity.dat" is the symlink name declared in getCacheFiles().
            geoloc = new LookupService("GeoLiteCity.dat", LookupService.GEOIP_MEMORY_CACHE);
        }

        // Look the address up once; getLocation() returns null for unknown IPs.
        com.maxmind.geoip.Location location = geoloc.getLocation(ip);
        if (location == null) {
            return null;
        }

        return location.countryCode + "\t"
                + location.countryName + "\t"
                + location.region + "\t"
                + location.postalCode + "\t"
                + location.city;
    }

    @Override
    public Schema outputSchema(Schema input) {
        return new Schema(new Schema.FieldSchema(COUNTRY, DataType.CHARARRAY));
    }

}
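
A note on the entry returned by getCacheFiles: as far as the Pig UDF documentation describes it, each entry has the form path#symlink, where the path is resolved on the cluster file system (HDFS in MapReduce mode) and the symlink is the name under which the file appears in the task's working directory. The sketch below only illustrates that split for the entry used above; the class CacheSpecSketch and the helper splitCacheSpec are hypothetical names for illustration and are not part of Pig or Hadoop.

```java
public class CacheSpecSketch {

    /**
     * Splits a cache entry of the form "path#symlink" into its two parts.
     * Illustration only; this helper is hypothetical and not part of Pig.
     */
    static String[] splitCacheSpec(String spec) {
        int hash = spec.lastIndexOf('#');
        // Reject entries without a '#', or with an empty path or symlink name.
        if (hash <= 0 || hash == spec.length() - 1) {
            throw new IllegalArgumentException("expected path#symlink, got: " + spec);
        }
        return new String[] { spec.substring(0, hash), spec.substring(hash + 1) };
    }

    public static void main(String[] args) {
        String[] parts = splitCacheSpec("/tmp/GeoLiteCity.dat#GeoLiteCity.dat");
        System.out.println("file system path: " + parts[0]);
        System.out.println("symlink in task working directory: " + parts[1]);
    }
}
```

If /tmp/GeoLiteCity.dat exists only on the local disk, one thing worth checking is whether the file has also been copied to that path in HDFS, since that is where the first part of the entry would be looked up under this reading.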

0 answers:

No answers