Question

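I have to write a MapReduce job, but I don't know how to go about it.

I have a jar, MARD.jar, through which I can instantiate a MARD object. On that object I call the normaliseFile method, i.e. mard.normaliseFile(a bunch of parameters).

This transformation creates some output file.

For the normalise method to run, it needs a folder named myMard in the working directory. So I thought I would pass the myMard folder as the input path of the Hadoop job, but I'm not sure that would help, because mard.normaliseFile(a bunch of parameters) will look for the myMard folder in the working directory and will not find it. As I understand it, a Mapper can only access file contents through the "value" it gets from the FileSplit; it cannot directly access the files in the myMard folder.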

In short, I have to execute the following code through MapReduce:

File setupFolder = new File(setupFolderName);
setupFolder.mkdirs();

MARD mard = new MARD(setupFolder);
Text valuz = new Text();
IntWritable intval = new IntWritable();

// Input file and the two output files passed to normaliseFile
File original = new File("Vca1652.txt");
File mardedxml = new File("Vca1652-mardedxml.txt");
File marded = new File("Vca1652-marded.txt");

mardedxml.createNewFile();
marded.createNewFile();

NormalisationStats stats;
try {
    // This method requires access to the myMard folder
    stats = mard.normaliseFile(original, mardedxml, marded, 50.0);
    System.out.println(stats);
} catch (MARDException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

Please help.

How should I design my mapper?
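
To make the gap concrete: the mapper receives the file content as a value, while normaliseFile wants File objects on the local disk. Below is a minimal sketch of one possible bridge, which writes the content to a local scratch file, runs a file-based call on it, and reads the result back. FileNormaliser and LocalFileBridge are hypothetical names invented for this illustration, and the toy lower-casing normaliser only stands in for mard.normaliseFile; nothing here is part of MARD or Hadoop. The sketch also does not solve the myMard problem by itself. As far as I understand, Hadoop's distributed cache (for example the -archives generic option, or Job.addCacheArchive with a URI ending in #myMard) is the usual way to make such a folder appear in each task's working directory, but that is untested here.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

/** Hypothetical stand-in for a file-based call such as mard.normaliseFile(...). */
interface FileNormaliser {
    void normalise(File original, File output) throws IOException;
}

/** Hypothetical helper: bridges in-memory record content to a file-based API. */
class LocalFileBridge {

    static String runOnLocalFiles(String baseName, String content, FileNormaliser normaliser)
            throws IOException {
        // Scratch directory on the local disk (inside a mapper, this is the task node's disk).
        Path workDir = Files.createTempDirectory("mard-work");
        Path original = workDir.resolve(baseName + ".txt");
        Path output = workDir.resolve(baseName + "-marded.txt");
        try {
            // Materialise the record content as a real file for the file-based API.
            Files.write(original, content.getBytes(StandardCharsets.UTF_8));
            normaliser.normalise(original.toFile(), output.toFile());
            // Read the result back so it can be emitted as a mapper output value.
            return new String(Files.readAllBytes(output), StandardCharsets.UTF_8);
        } finally {
            // Assumes the normaliser wrote nothing else into the scratch directory.
            Files.deleteIfExists(original);
            Files.deleteIfExists(output);
            Files.deleteIfExists(workDir);
        }
    }

    public static void main(String[] args) throws IOException {
        // Toy normaliser that lower-cases the text; the real call would be mard.normaliseFile(...).
        FileNormaliser toy = (in, out) -> {
            String text = new String(Files.readAllBytes(in.toPath()), StandardCharsets.UTF_8);
            Files.write(out.toPath(), text.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8));
        };
        System.out.println(runOnLocalFiles("Vca1652", "Some OLDE Text", toy));
    }
}
```

Inside a real Mapper, the map method would presumably call something like runOnLocalFiles with the record's value and write the returned text to the context. Whether one record should correspond to one whole file depends on the input format, which is a separate design decision.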
