Apache Flink - Using values from a data stream to dynamically create a streaming data source

Date: 2016-12-18 16:11:39

Tags: apache-flink

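I am trying to build a sample application with Apache Flink that does the following:

  1. Reads a stream of stock symbols (e.g. 'CSCO', 'FB') from a Kafka queue.
  2. For each symbol, performs a real-time lookup of the current price and streams the values for downstream processing.

*Update to original post*

I moved the map function into a separate class and no longer get the run-time error message "The implementation of the MapFunction is not serializable any more. The object probably contains or references non serializable fields."

The issue I'm facing now is that the Kafka topic "stockprices" that I'm trying to write the prices to is not receiving them. I'm troubleshooting and will post any updates.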

    public class RetrieveStockPrices { 
        @SuppressWarnings("serial") 
        public static void main(String[] args) throws Exception { 
            final StreamExecutionEnvironment streamExecEnv = StreamExecutionEnvironment.getExecutionEnvironment();
            streamExecEnv.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime); 
    
            Properties properties = new Properties(); 
            properties.setProperty("bootstrap.servers", "localhost:9092"); 
            properties.setProperty("zookeeper.connect", "localhost:2181"); 
            properties.setProperty("group.id", "stocks"); 
    
            DataStream<String> streamOfStockSymbols = streamExecEnv.addSource(new FlinkKafkaConsumer08<String>("stocksymbol", new SimpleStringSchema(), properties)); 
    
            DataStream<String> stockPrice = 
                streamOfStockSymbols 
                //get unique keys 
                .keyBy(new KeySelector<String, String>() { 
                    @Override 
                    public String getKey(String trend) throws Exception {
                        return trend; 
                    }
                })
                //collect events over a window 
                .window(TumblingEventTimeWindows.of(Time.seconds(60))) 
                //return the last event from the window...all elements are the same "Symbol" 
                .apply(new WindowFunction<String, String, String, TimeWindow>() {
                    @Override 
                    public void apply(String key, TimeWindow window, Iterable<String> input, Collector<String> out) throws Exception { 
                        out.collect(input.iterator().next().toString()); 
                    }
                })
                .map(new StockSymbolToPriceMapFunction());
    
            streamExecEnv.execute("Retrieve Stock Prices"); 
        }
    }
    
    public class StockSymbolToPriceMapFunction extends RichMapFunction<String, String> {
        @Override
        public String map(String stockSymbol) throws Exception {
            final StreamExecutionEnvironment streamExecEnv = StreamExecutionEnvironment.getExecutionEnvironment();
            streamExecEnv.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime);
            System.out.println("StockSymbolToPriceMapFunction: stockSymbol: " + stockSymbol);
    
            DataStream<String> stockPrices = streamExecEnv.addSource(new LookupStockPrice(stockSymbol));
            stockPrices.keyBy(new CustomKeySelector()).addSink(new FlinkKafkaProducer08<String>("localhost:9092", "stockprices", new SimpleStringSchema()));
    
            return "100000";
        }
    
        private static class CustomKeySelector implements KeySelector<String, String> {
            @Override
            public String getKey(String arg0) throws Exception {
                return arg0.trim();
            }
        }
    }
    
    
    public class LookupStockPrice extends RichSourceFunction<String> { 
        public String stockSymbol = null; 
        public boolean isRunning = true; 
    
        public LookupStockPrice(String inSymbol) { 
                stockSymbol = inSymbol; 
        } 
    
        @Override 
        public void open(Configuration parameters) throws Exception { 
                isRunning = true; 
        } 
    
    
        @Override 
        public void cancel() { 
                isRunning = false; 
        } 
    
        @Override 
        public void run(SourceFunction.SourceContext<String> ctx) 
                        throws Exception { 
                String stockPrice = "0";
                while (isRunning) { 
                    //TODO: query Google Finance API 
                    stockPrice = Integer.toString((new Random()).nextInt(100)+1);
                    ctx.collect(stockPrice);
                    Thread.sleep(10000);
                } 
        } 
    }
    

1 Answer:

Answer 0 (score: 4)

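The StreamExecutionEnvironment is not intended to be used inside the operators of a streaming application. "Not intended" means that this usage is neither tested nor encouraged. It might work and do something, but it will most likely not behave well and will probably kill your application.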

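The StockSymbolToPriceMapFunction in your program specifies a completely new and independent streaming application for each incoming record. However, since you never call streamExecEnv.execute(), these programs are never started, and the map method returns without doing anything.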
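As a rough analogy only (plain `java.util.stream`, not Flink code), the sketch below shows the same "declared but never executed" effect: building a pipeline runs nothing until something triggers it. The class name `LazyPipelineAnalogy` and the placeholder price `100` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Analogy only: java.util.stream pipelines, like Flink programs, are lazy.
final class LazyPipelineAnalogy {

    // Declares a pipeline whose map step records every symbol it processes.
    static Stream<String> declare(List<String> seen) {
        return Stream.of("CSCO", "FB").map(symbol -> {
            seen.add(symbol);          // runs only once the pipeline is executed
            return symbol + ",100";    // placeholder price
        });
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();

        // Comparable to building a DataStream program without calling execute().
        Stream<String> pipeline = declare(seen);
        System.out.println("after declaring: " + seen.size());          // prints 0

        // Comparable to calling execute(): only now does the map step run.
        List<String> prices = pipeline.collect(Collectors.toList());
        System.out.println("after terminal operation: " + seen.size()); // prints 2
        System.out.println(prices);
    }
}
```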

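If you did call streamExecEnv.execute(), the function would start a new local Flink cluster in the worker's JVM and run the application on that local cluster. A local Flink instance takes up a lot of heap space, and after a few such clusters have been started, the worker would probably die with an OutOfMemoryError, which is not what you want to happen.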
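One way to avoid creating an environment inside the operator, sketched here under assumptions rather than taken from the original job, is to do the price lookup directly in the function and return the result as the mapped value. The helper `StockPriceLookup`, its `seed` parameter, and the `"SYMBOL,price"` output format are hypothetical; the random number stands in for the real price query, as it does in the question's `LookupStockPrice`.

```java
import java.io.Serializable;
import java.util.Random;
import java.util.function.Function;

// Hypothetical helper (not part of Flink): maps a stock symbol to "SYMBOL,price".
// It is Serializable so that a function holding it as a field can be serialized.
final class StockPriceLookup implements Function<String, String>, Serializable {
    private static final long serialVersionUID = 1L;

    private final Random random;

    StockPriceLookup(long seed) {
        this.random = new Random(seed);
    }

    @Override
    public String apply(String stockSymbol) {
        // Placeholder for a real price query: a random price between 1 and 100.
        int price = random.nextInt(100) + 1;
        return stockSymbol.trim() + "," + price;
    }

    public static void main(String[] args) {
        StockPriceLookup lookup = new StockPriceLookup(42L);
        for (String symbol : new String[] {"CSCO", "FB"}) {
            System.out.println(lookup.apply(symbol));
        }
    }
}
```

In the Flink job, `StockSymbolToPriceMapFunction.map` could then simply return `lookup.apply(stockSymbol)`, and `main` could attach the question's Kafka producer to the resulting stream with `stockPrice.addSink(...)`, so that only one environment is ever created and executed. That wiring is untested here. Note that the price would then be looked up once per emitted window result instead of every 10 seconds.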