Is it possible to pass a file from a Spout to a Bolt in Apache Storm?

Asked: 2016-03-31 10:39:43

Tags: apache-storm

public class MySpout implements IRichSpout{
    private List fileName;

    public void nextTuple(){    
        File file = new File("D:/small progs/tika_document_type_detection.pdf");
        fileName.add(file);
        this.collector.emit(new Values(fileName));
    }

}

In Bolt
public class MyBolt implements IRichBolt{
    public void execute(Tuple tuple){
        FileInputStream stream = new FileInputStream(tuple.getValues(0));
        //can i use this stream obj to parse this file(using Apache tika)
    }
}

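Here I am unable to pass the File object from the Spout to the Bolt. Am I missing something? First of all, my question is whether we can pass an object from a Spout to a Bolt using: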

SpoutOutputCollector collector.emit(fileName)

Here fileName is a list of objects that contains the File object.

1 Answer:

Answer 0 (score: 0)

Your code does not work as written. Your Spout emits a List of File objects, which are serialized and sent to the bolt, so the bolt can receive the list of File objects like this:

public void execute(Tuple tuple){
    List<File> files = (List<File>)tuple.getValue(0);
}

But I guess that is not what you want...
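To see why receiving File objects is rarely enough, it may help to look at what a serialized File carries. The sketch below uses plain Java serialization as a stand-in for whatever serializer the topology is configured with (an assumption; no Storm classes are involved). FileRoundTrip and the path "some-document.pdf" are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical demo class: plain Java serialization stands in for tuple serialization.
class FileRoundTrip {

    // Serializes a File to bytes and reads it back, roughly what happens between two workers.
    static File roundTrip(File file) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(file);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (File) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The file does not need to exist: only the path is serialized, not the contents.
        File original = new File("some-document.pdf");
        File copy = roundTrip(original);
        System.out.println(copy.getPath());
        System.out.println(copy.equals(original));
    }
}
```

A java.io.File is only an abstract path name, so the bolt gets a usable File only if the same path is readable on the machine where the bolt runs.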

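I assume you want to send the content of the file byte by byte from the Spout to the Bolt. To do that, open the file in Spout.nextTuple(), read it byte by byte, and emit one byte per call to emit:

private FileInputStream stream = null;

public void nextTuple() {
    try {
        if (stream == null) {
            stream = new FileInputStream(new File("..."));
        }
        int b = stream.read();  // read() returns an int, or -1 at end of file
        if (b != -1) {
            this.collector.emit(new Values((byte) b));
        }
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

Of course, the bolt now receives one byte per tuple:

public void execute(Tuple tuple){
    Byte b = tuple.getByte(0);
}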

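Since the question asks whether the bolt can end up with a stream to hand to a parser, here is a minimal sketch of a variation on the byte-per-tuple idea: send the content in byte[] chunks and rebuild it on the receiving side. It is plain Java with no Storm classes. FileChunks, split, join and the chunk size are hypothetical, each byte[] stands in for one tuple's payload, and it assumes all chunks of a file reach the same bolt task in order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Storm: each byte[] stands in for one tuple's payload.
class FileChunks {

    // Spout side: cut the stream into arrays of at most chunkSize bytes.
    static List<byte[]> split(InputStream in, int chunkSize) {
        List<byte[]> chunks = new ArrayList<>();
        byte[] buffer = new byte[chunkSize];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                if (n > 0) {
                    chunks.add(Arrays.copyOf(buffer, n));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return chunks;
    }

    // Bolt side: append the chunks in arrival order to recover the original bytes.
    static byte[] join(List<byte[]> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] content = "example file content".getBytes(StandardCharsets.UTF_8);
        List<byte[]> chunks = split(new ByteArrayInputStream(content), 8);
        byte[] rebuilt = join(chunks);
        // The rebuilt bytes can be wrapped in a stream for any parser that accepts an InputStream.
        InputStream forParser = new ByteArrayInputStream(rebuilt);
        System.out.println(chunks.size());
        System.out.println(Arrays.equals(content, rebuilt));
    }
}
```

Emitting chunks instead of single bytes means far fewer tuples per file, at the cost of having to track which chunks belong to which file and when the file is complete.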