Can we consume JMS messages from a topic with Spark Streaming?

Date: 2019-05-30 20:08:33

Tags: scala apache-spark spark-streaming apache-kafka-streams

Question: There is a JMS topic, and I have the following connection details: URL: xyz, Connection Factory: jms.xyz, Topic Name: jms.xyz, Username: , Password:

Is there working code that creates a subscriber in Spark Scala and consumes JMS messages from the topic?

I have tried Spark Streaming's socketTextStream function, but it only takes a URL parameter. I am looking for a Spark Streaming function that can take all 5 of my parameters: 1) URL 2) Connection Factory 3) Topic Name 4) Username 5) Password

I tried running this in Spark-Shell.

Again: I am looking for a Spark Streaming function that accepts all 5 of these parameters, plus working Spark Scala code that consumes the JMS messages from the topic.

I am looking for basic Spark-Shell commands that can be executed line by line.

1 Answer:

Answer 0 (score: 1)


Q: Can we consume JMS messages from a topic with Spark Streaming?

Yes. AFAIK there is no ready-made solution, though.

The implementation may differ depending on the message provider. You may need to write a custom receiver for this; see Custom Receivers in the Spark docs.

See Example 2, which integrates a JMS topic with Spark Streaming.

Example 1 (source):

import org.apache.log4j.Logger;
import org.apache.spark.storage.StorageLevel;
import org.apache.spark.streaming.receiver.Receiver;

import javax.jms.*;
import javax.naming.Context;
import java.util.Hashtable;

public class JMSReceiver extends Receiver<JMSEvent> implements MessageListener
{
    private static final Logger log = Logger.getLogger(JMSReceiver.class);

    private static final String JNDI_INITIAL_CONTEXT_FACTORY       = "org.apache.qpid.jms.jndi.JmsInitialContextFactory";
    private static final String JNDI_CONNECTION_FACTORY_NAME       = "JMSReceiverConnectionFactory";
    private static final String JNDI_QUEUE_NAME                    = "JMSReceiverQueue";
    private static final String JNDI_CONNECTION_FACTORY_KEY_PREFIX = "connectionfactory.";
    private static final String JNDI_QUEUE_KEY_PREFIX              = "queue.";

    private StorageLevel _storageLevel;

    private String _brokerURL;
    private String _username;
    private String _password;
    private String _queueName;
    private String _selector;

    private Connection _connection;

    public JMSReceiver(String brokerURL, String username, String password, String queueName, String selector, StorageLevel storageLevel)
    {
        super(storageLevel);
        _storageLevel = storageLevel;
        _brokerURL = brokerURL;
        _username = username;
        _password = password;
        _queueName = queueName;
        _selector = selector;

        log.info("Constructed " + this);
    }

    @Override
    public void onMessage(Message message)
    {
        try
        {
            log.info("Received: " + message);
            JMSEvent jmsEvent = new JMSEvent(message);
            store(jmsEvent);
        } catch (Exception exp)
        {
            log.error("Caught exception converting JMS message to JMSEvent", exp);
        }
    }

    @Override
    public StorageLevel storageLevel()
    {
        return _storageLevel;
    }

    public void onStart()
    {

        log.info("Starting up...");

        try
        {

            Hashtable<Object, Object> env = new Hashtable<Object, Object>();
            env.put(Context.INITIAL_CONTEXT_FACTORY, JNDI_INITIAL_CONTEXT_FACTORY);
            env.put(JNDI_CONNECTION_FACTORY_KEY_PREFIX + JNDI_CONNECTION_FACTORY_NAME, _brokerURL);
            env.put(JNDI_QUEUE_KEY_PREFIX + JNDI_QUEUE_NAME, _queueName);
            javax.naming.Context context = new javax.naming.InitialContext(env);

            ConnectionFactory factory = (ConnectionFactory) context.lookup(JNDI_CONNECTION_FACTORY_NAME);
            Destination queue = (Destination) context.lookup(JNDI_QUEUE_NAME);

            if ((_username == null) || (_password == null))
            {
                _connection = factory.createConnection();
            } else
            {
                _connection = factory.createConnection(_username, _password);
            }
            _connection.setExceptionListener(new JMSReceiverExceptionListener());

            Session session = _connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

            MessageConsumer messageConsumer;

            if (_selector != null)
            {
                messageConsumer = session.createConsumer(queue, _selector);
            } else
            {
                messageConsumer = session.createConsumer(queue);
            }
            messageConsumer.setMessageListener(this);

            _connection.start();

            log.info("Completed startup.");
        } catch (Exception exp)
        {
            // Caught exception, try a restart
            log.error("Caught exception in startup", exp);
            restart("Caught exception, restarting.", exp);
        }
    }

    public void onStop()
    {
        // Cleanup stuff (stop threads, close sockets, etc.) to stop receiving data

        log.info("Stopping...");
        try
        {
            _connection.close();
        } catch (JMSException exp)
        {
            log.error("Caught exception stopping", exp);
        }
        log.info("Stopped.");
    }

    private class JMSReceiverExceptionListener implements ExceptionListener
    {
        @Override
        public void onException(JMSException exp)
        {
            log.error("Connection ExceptionListener fired, attempting restart.", exp);
            restart("Connection ExceptionListener fired, attempting restart.");
        }
    }

    @Override
    public String toString()
    {
        return "JMSReceiver{" +
                "brokerURL='" + _brokerURL + '\'' +
                ", username='" + _username + '\'' +
                ", password='" + _password + '\'' +
                ", queueName='" + _queueName + '\'' +
                ", selector='" + _selector + '\'' +
                '}';
    }
}
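
The trickiest part of the receiver above is `onStart()`: the connection factory and queue are not passed to JNDI directly, but registered in the environment under the `connectionfactory.` and `queue.` key prefixes that Qpid's `JmsInitialContextFactory` expects, and then looked up by name. A minimal JDK-only sketch of just that environment-building step (the class name `JndiEnvSketch` and the sample broker URL are illustrative, not from the original post):

```java
import java.util.Hashtable;
import javax.naming.Context;

// Sketch of the JNDI environment that JMSReceiver.onStart() builds.
// The "connectionfactory." / "queue." key prefixes are the naming
// convention Qpid's JmsInitialContextFactory resolves lookups against.
public class JndiEnvSketch {
    public static Hashtable<Object, Object> buildEnv(String brokerURL, String queueName) {
        Hashtable<Object, Object> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "org.apache.qpid.jms.jndi.JmsInitialContextFactory");
        // Registered under "connectionfactory.<name>", later looked up as <name>
        env.put("connectionfactory.JMSReceiverConnectionFactory", brokerURL);
        // Registered under "queue.<name>", later looked up as <name>
        env.put("queue.JMSReceiverQueue", queueName);
        return env;
    }
}
```

With this env, `new InitialContext(env)` followed by `context.lookup("JMSReceiverConnectionFactory")` and `context.lookup("JMSReceiverQueue")` yields the `ConnectionFactory` and `Destination`, exactly as the receiver does.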

Your JMSInputDStream would look like this:

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming._
import org.apache.spark.streaming.dstream._
import org.apache.spark.streaming.receiver.Receiver

private[streaming]
class JMSInputDStream(
                       @transient ssc_ : StreamingContext,
                       brokerURL: String,
                       username: String,
                       password: String,
                       queuename: String,
                       selector: String,
                       storageLevel: StorageLevel
                       ) extends ReceiverInputDStream[JMSEvent](ssc_) {

  override def getReceiver(): Receiver[JMSEvent] = {
    new JMSReceiver(brokerURL, username, password, queuename, selector, storageLevel)
  }
}
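
To connect this back to the question's five parameters: a hypothetical helper (names assumed, not from the original post) showing how they map onto the argument order the `JMSReceiver` constructor above expects. Note that the connection factory is not a constructor argument at all; in Example 1 it is resolved through JNDI under the `connectionfactory.` prefix instead.

```java
// Hypothetical helper bundling the question's five connection details.
public class JmsConnectionDetails {
    public final String url, connectionFactory, topicName, username, password;

    public JmsConnectionDetails(String url, String connectionFactory,
                                String topicName, String username, String password) {
        this.url = url;
        this.connectionFactory = connectionFactory;
        this.topicName = topicName;
        this.username = username;
        this.password = password;
    }

    // JMSReceiver argument order: brokerURL, username, password,
    // destination name, selector (null = receive all messages).
    public String[] toReceiverArgs() {
        return new String[] { url, username, password, topicName, null };
    }
}
```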

Example 2 uses ActiveMQ: JmsTopicReceiver.scala

import org.apache.spark.Logging
import org.apache.spark.storage.StorageLevel
import javax.{jms => jms}

/** Simple class of a receiver that can be run on worker nodes to receive the data from JMS Topic.
  *
  * In JMS a Topic implements publish and subscribe semantics.
  * When you publish a message it goes to all the subscribers who are interested - so zero to many subscribers will receive a copy of the message.
  * Only subscribers who had an active subscription at the time the broker receives the message will get a copy of the message.
  *
  * {{{
  *  val sc: SparkContext = SparkContext.getOrCreate(conf)
  *  val ssc: StreamingContext = new StreamingContext(sc, Seconds(...))
  *
  *  val stream: InputDStream[String] = ssc.receiverStream(new JmsTopicReceiver(
  *    topicName = "testTopic",
  *    transformer = { msg => msg.asInstanceOf[javax.jms.TextMessage].getText() },
  *    connectionProvider = { () => {
  *      val cf = new org.apache.activemq.ActiveMQConnectionFactory("tcp://localhost:61616")
  *      cf.setOptimizeAcknowledge(true)
  *      cf.createConnection("username", "password")
  *    }}
  *  ))
  *
  *  ...
  *
  *  ssc.start()
  *  ssc.awaitTermination()
  * }}}
  *
  * @param connectionProvider provides <CODE>javax.jms.Connection</CODE> for the receiver.
  * @param transformer (pre)transforms <CODE>javax.jms.Message</CODE> to an appropriate class (this must be done before storing the result).
  * @param topicName the name of required <CODE>javax.jms.Topic</CODE>.
  * @param messageSelector only messages with properties matching the message selector expression are delivered.
  * @param storageLevel flags for controlling the storage of an RDD.
  * @tparam T RDD element type.
  */
class JmsTopicReceiver[T] (
  connectionProvider: (() => jms.Connection),
  transformer: (jms.Message => T),
  topicName: String,
  messageSelector: Option[String] = None,
  storageLevel: StorageLevel = StorageLevel.MEMORY_AND_DISK_SER_2
) extends AbstractJmsReceiver[T](
  messageSelector = messageSelector,
  storageLevel = storageLevel
) with Logging {

  override protected def buildConnection(): jms.Connection = connectionProvider()
  override protected def transform(message: jms.Message): T = transformer(message)
  override protected def buildDestination(session: jms.Session): jms.Destination = session.createTopic(topicName)

}

Example 3: Solace, using a custom Spark receiver. I worked with this a long while back, when Spark 1.3 was current:

Solace-JMS-Integration-Spark-Streaming.pdf

Further reading: Processing Data from MQ with Spark Streaming: Part 1 - Introduction to Messaging, JMS & MQ