public class MySpout implements IRichSpout{
    private List fileName;
    public void nextTuple(){
        File file = new File("D:/small progs/tika_document_type_detection.pdf");
        fileName.add(file);
        this.collector.emit(new Values(fileName));
    }
}
In the Bolt:
public class MyBolt implements IRichBolt{
    public void execute(Tuple tuple){
        FileInputStream stream = new FileInputStream(tuple.getValues(0));
        // Can I use this stream object to parse the file (using Apache Tika)?
    }
}
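Here I am unable to pass the File object from the Spout to the Bolt. Am I missing something? First of all, my question is whether we can pass an object from a Spout to a Bolt using:
SpoutOutputCollector collector.emit(fileName)
Here fileName is a list of objects that contains File objects.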
Answer 0 (score: 0)
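Your code does not work because your Spout emits a List of File objects, not a single File. These objects are serialized and sent to the bolt, so the bolt can receive the list of File objects like this:
public void execute(Tuple tuple){
    List<File> files = (List<File>)tuple.getValue(0);
}
But I guess that is not what you want...
I guess you want to send the content of the file byte by byte from the Spout to the Bolt. For this, you need to open the file in Spout.nextTuple(), read it byte by byte, and emit one byte per call to emit:
FileInputStream stream = null;
public void nextTuple() {
    try {
        if(stream == null) {
            stream = new FileInputStream(new File("..."));
        }
        int b = stream.read(); // read() returns -1 at end of file
        if(b != -1) {
            this.collector.emit(new Values((byte) b));
        }
    } catch(IOException e) {
        throw new RuntimeException(e);
    }
}
Of course, the bolt now receives a single byte per tuple:
public void execute(Tuple tuple){
    Byte b = tuple.getByte(0);
}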
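As a rough illustration of the byte-by-byte idea, here is a stand-alone sketch that deliberately uses no Storm classes so it can run on its own. The names ByteStreamSketch, ByteCollector, send, accept and asStream are hypothetical and do not come from the question or from Storm. The sender loop stands in for repeated nextTuple() calls, and accept stands in for what execute would do with tuple.getByte(0). It assumes that every byte of one file reaches the same receiver and that the receiver somehow learns when the file is complete.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-alone sketch: no Storm classes are used here.
public class ByteStreamSketch {

    // Plays the role of the bolt: collects the bytes it receives one at a time.
    static class ByteCollector {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

        // In a bolt this would be fed from execute(Tuple) via tuple.getByte(0).
        void accept(byte b) {
            buffer.write(b);
        }

        // Once all bytes have arrived, expose them as a stream for a parser.
        InputStream asStream() {
            return new ByteArrayInputStream(buffer.toByteArray());
        }

        int size() {
            return buffer.size();
        }
    }

    // Plays the role of the spout: reads the file and hands over one byte at a time.
    static ByteCollector send(Path file) throws IOException {
        ByteCollector collector = new ByteCollector();
        try (InputStream in = new FileInputStream(file.toFile())) {
            int b;
            while ((b = in.read()) != -1) {  // read() returns -1 at end of file
                collector.accept((byte) b);  // stands in for emit(new Values((byte) b))
            }
        }
        return collector;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sketch", ".bin");
        Files.write(tmp, new byte[] {1, 2, 3, (byte) 0xFF});
        ByteCollector received = send(tmp);
        System.out.println("bytes received: " + received.size());
        Files.delete(tmp);
    }
}
```

The stream returned by asStream is an ordinary InputStream, so it could in principle be handed to a parser that accepts one. In a real topology the bolt would also need an end-of-file signal and a grouping that routes all bytes of a file to the same bolt instance, which this sketch does not cover.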