I am trying to reproduce the Bloom Filtering example from MapReduce Design Patterns. Below I show only the relevant code:
public static class BloomFilteringMapper extends Mapper<Object, Text, Text, NullWritable>
{
    private BloomFilter filter = new BloomFilter();

    protected void setup( Context context ) throws IOException
    {
        // Look up the cached Bloom filter file and deserialize it
        URI[] files = DistributedCache.getCacheFiles( context.getConfiguration() );
        String path = files[0].getPath();
        System.out.println( "Reading Bloom Filter from: " + path );
        DataInputStream strm = new DataInputStream( new FileInputStream( path ) );
        filter.readFields( strm );
        strm.close();
    }
    //...
}
public static void main( String[] args ) throws Exception
{
    Job job = new Job( new Configuration(), "description" );
    // Register the Bloom filter stored in HDFS as a cache file
    URI uri = new URI("hdfs://localhost:9000/user/draxent/comment.bloomfilter");
    DistributedCache.addCacheFile( uri, job.getConfiguration() );
    //...
}
When I try to run it, I get the following error:
java.io.FileNotFoundException: /user/draxent/comment.bloomfilter
But when I run the command:
bin/hadoop fs -ls
I can see the file:
-rw-r--r-- 1 draxent supergroup 405 2015-11-25 17:12 /user/draxent/comment.bloomfilter
So I am fairly sure the problem is in this line:
URI uri = new URI("hdfs://localhost:9000/user/draxent/comment.bloomfilter");
But I have tried several different values, such as:
"hdfs://user/draxent/comment.bloomfilter"
"/user/draxent/comment.bloomfilter"
"comment.bloomfilter"
None of them worked.
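As a quick check of what each of these strings means to java.net.URI, here is a minimal standalone sketch. The names UriPathDemo, pathOf and hostOf are made up for illustration and are not part of Hadoop. It suggests that getPath() drops the scheme and host, so the FileInputStream in setup() would be looking for /user/draxent/comment.bloomfilter on the local disk rather than in HDFS. That reading of the error is an assumption, not something confirmed on the cluster.

```java
import java.net.URI;

class UriPathDemo {
    // Path component only; this mirrors what files[0].getPath() returns in setup().
    static String pathOf(String uri) {
        return URI.create(uri).getPath();
    }

    // Host component, or null when the string has no authority part.
    static String hostOf(String uri) {
        return URI.create(uri).getHost();
    }

    public static void main(String[] args) {
        String[] candidates = {
            "hdfs://localhost:9000/user/draxent/comment.bloomfilter",
            "hdfs://user/draxent/comment.bloomfilter",
            "/user/draxent/comment.bloomfilter",
            "comment.bloomfilter"
        };
        for (String c : candidates) {
            System.out.println(c + " -> host=" + hostOf(c) + ", path=" + pathOf(c));
        }
    }
}
```

Note that in the second variant the word "user" is parsed as the host name, so the path seen by Hadoop would no longer start with /user.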
I tried looking at cfeduke's implementation, but I still could not solve the problem.
Answers:
Answer 0 (score: 1)
The DistributedCache API is deprecated.
You can get the same functionality with the new API. See the documentation here: {{3}}
In the driver code:
job.addCacheFile( new URI( "hdfs://localhost:9000/user/draxent/comment.bloomfilter" ) );
In the mapper's setup method:
Path[] localPaths = context.getLocalCacheFiles();
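If the job runs on Hadoop 2.x, my understanding is that each cache file is linked into the task's working directory under the URI fragment (the part after #) if there is one, and under the file name otherwise. The sketch below computes that name in plain Java. The class CacheLinkName and the method linkNameOf are hypothetical helpers, not Hadoop API, and the naming rule itself is an assumption that should be checked against the Hadoop version in use.

```java
import java.net.URI;

class CacheLinkName {
    // Hypothetical helper (not part of Hadoop): the name a cache file is
    // ASSUMED to get in the task's working directory on Hadoop 2.x,
    // i.e. the URI fragment if present, otherwise the last path segment.
    static String linkNameOf(URI cacheFile) {
        String fragment = cacheFile.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = cacheFile.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        URI plain = URI.create("hdfs://localhost:9000/user/draxent/comment.bloomfilter");
        URI named = URI.create("hdfs://localhost:9000/user/draxent/comment.bloomfilter#filter");
        System.out.println(linkNameOf(plain));
        System.out.println(linkNameOf(named));
    }
}
```

Under that assumption, setup() could open the filter with new FileInputStream(linkNameOf(uri)), a path relative to the working directory, instead of the HDFS path.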
Answer 1 (score: 0)
The following should work:
Remove the line URI uri = new URI(...
and change the next line to:
DistributedCache.addCacheFile(new Path("/user/draxent/comment.bloomfilter").toUri(), job.getConfiguration());