Identifying the first and last lines of a .csv file

Time: 2014-12-08 21:45:48

Tags: java csv hadoop emr hadoop2

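I am writing MapReduce code that reads the first line and the last line of a file and adds some content to those strings, while the remaining lines are printed as they are. I am working with a .csv file and extracting the tags. This logic works when I run it as an ordinary Java program, but how do I use it in the mapper part of the MR code? I don't understand how the mapper is supposed to find these lines. Here is the code.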

    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        String next, fileContent = value.toString();
        for (boolean first = true, last = (fileContent == null); !last; first = false, fileContent = next) {
            last = ((next = value.toString()) == null);
            if (first) {
                String[] tab = fileContent.split(",");
                String line = "var trip = [" + "\n" + "[" + tab[2] + "," + tab[3] + "," + tab[8] + "," + tab[11] + "," + tab[15] + "," + tab[16] + "],";
                text.set(line);
            }
            else if (last) {
                String[] tab = fileContent.split(",");
                String line = "[" + tab[2] + "," + tab[3] + "," + tab[8] + "," + tab[11] + "," + tab[15] + "," + tab[16] + "]" + "\n" + "];";
                text.set(line);
            }
            else {
                String[] tab = fileContent.split(",");
                String line = "[" + tab[2] + "," + tab[3] + "," + tab[8] + "," + tab[11] + "," + tab[15] + "," + tab[16] + "],";
                text.set(line);
            }
        }
        context.write(null, text);
    }
}
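
For context on the question above: with the default text input format, map() is called once per line, so `value` holds a single line and `value.toString()` never returns null, which means the loop above has no way to reach a "last" line. One possible approach, sketched here in plain Java so it can run without Hadoop, is a one-line lookahead: hold each formatted row back until the next one arrives, and close the array only when the input is finished. The class `TripFormatter` and its methods are hypothetical names, not part of the original post or of Hadoop. The sketch assumes the key passed to map() is the line's byte offset (so offset 0 marks the first line), that the whole file is handled by a single mapper (one input split), and that the job is map-only so output order is preserved. In a real mapper, `accept` would be called from `map()` and `finish` from `cleanup()`.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not from the original post): formats CSV rows as a
 * JavaScript array literal, using a one-line lookahead to find the last row.
 */
final class TripFormatter {
    // Column indexes copied from the code in the question.
    private static final int[] COLUMNS = {2, 3, 8, 11, 15, 16};

    private final List<String> out = new ArrayList<>();
    // Formatted row held back until we know whether another row follows it.
    private String pending;

    /** Call once per input line (in Hadoop: from map(), passing the key as the offset). */
    void accept(long byteOffset, String csvLine) {
        if (pending != null) {
            // Another row arrived, so the held-back row was not the last one.
            out.add(pending + ",");
        }
        String row = formatRow(csvLine);
        // Assumption: byte offset 0 identifies the first line of the file.
        pending = (byteOffset == 0) ? "var trip = [\n" + row : row;
    }

    /** Call once after all lines were seen (in Hadoop: from cleanup()). */
    void finish() {
        if (pending != null) {
            // The held-back row is the last one: no trailing comma, close the array.
            out.add(pending + "\n];");
            pending = null;
        }
    }

    /** Lines produced so far, in order. */
    List<String> output() {
        return out;
    }

    /** Builds "[c2,c3,c8,c11,c15,c16]"; columns missing from a short row become empty. */
    static String formatRow(String csvLine) {
        String[] tab = csvLine.split(",", -1);
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < COLUMNS.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            int c = COLUMNS[i];
            sb.append(c < tab.length ? tab[c] : "");
        }
        return sb.append(']').toString();
    }
}
```

If the file is large enough to be divided into several input splits, each mapper sees only part of it and this lookahead no longer identifies the true last line; in that case the input would have to be kept in one split, or the header and footer added in a separate step after the job.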

0 answers:

No answers