Question: I have a JMS topic and the following connection details: URL: xyz, connection factory: jms.xyz, topic name: jms.xyz, username, password.
Is there working code that creates a subscriber in Spark Scala and consumes JMS messages from that topic?
I have tried Spark Streaming's socketTextStream function, but it only takes a URL parameter, and I have tried running things in spark-shell. What I am looking for is a Spark Streaming function that accepts all five of my parameters, together with working Spark Scala code that consumes JMS messages from the topic:
1) URL
2) Connection factory
3) Topic name
4) Username
5) Password
Ideally this would be basic spark-shell commands that can be executed line by line.
Answer (score: 1):
Q: Can we consume JMS messages from a topic via Spark Streaming?
Yes. AFAIK there is no out-of-the-box solution for this.
Implementations can vary depending on the message provider. You may need to write a custom receiver (see "custom receiver" in the Spark docs) for this.
See Example 2, which integrates a JMS topic with Spark Streaming.
Example 1 (source):
import org.apache.log4j.Logger;
import org.apache.spark.storage.StorageLevel;
import org.apache.spark.streaming.receiver.Receiver;
import javax.jms.*;
import javax.naming.Context;
import java.util.Hashtable;
public class JMSReceiver extends Receiver<JMSEvent> implements MessageListener
{
private static final Logger log = Logger.getLogger(JMSReceiver.class);
private static final String JNDI_INITIAL_CONTEXT_FACTORY = "org.apache.qpid.jms.jndi.JmsInitialContextFactory";
private static final String JNDI_CONNECTION_FACTORY_NAME = "JMSReceiverConnectionFactory";
private static final String JNDI_QUEUE_NAME = "JMSReceiverQueue";
private static final String JNDI_CONNECTION_FACTORY_KEY_PREFIX = "connectionfactory.";
private static final String JNDI_QUEUE_KEY_PREFIX = "queue.";
private StorageLevel _storageLevel;
private String _brokerURL;
private String _username;
private String _password;
private String _queueName;
private String _selector;
private Connection _connection;
public JMSReceiver(String brokerURL, String username, String password, String queueName, String selector, StorageLevel storageLevel)
{
super(storageLevel);
_storageLevel = storageLevel;
_brokerURL = brokerURL;
_username = username;
_password = password;
_queueName = queueName;
_selector = selector;
log.info("Constructed" + this);
}
@Override
public void onMessage(Message message)
{
try
{
log.info("Received: " + message);
JMSEvent jmsEvent = new JMSEvent(message);
store(jmsEvent);
} catch (Exception exp)
{
log.error("Caught exception converting JMS message to JMSEvent", exp);
}
}
@Override
public StorageLevel storageLevel()
{
return _storageLevel;
}
public void onStart()
{
log.info("Starting up...");
try
{
Hashtable<Object, Object> env = new Hashtable<Object, Object>();
env.put(Context.INITIAL_CONTEXT_FACTORY, JNDI_INITIAL_CONTEXT_FACTORY);
env.put(JNDI_CONNECTION_FACTORY_KEY_PREFIX + JNDI_CONNECTION_FACTORY_NAME, _brokerURL);
env.put(JNDI_QUEUE_KEY_PREFIX + JNDI_QUEUE_NAME, _queueName);
javax.naming.Context context = new javax.naming.InitialContext(env);
ConnectionFactory factory = (ConnectionFactory) context.lookup(JNDI_CONNECTION_FACTORY_NAME);
Destination queue = (Destination) context.lookup(JNDI_QUEUE_NAME);
if ((_username == null) || (_password == null))
{
_connection = factory.createConnection();
} else
{
_connection = factory.createConnection(_username, _password);
}
_connection.setExceptionListener(new JMSReceiverExceptionListener());
Session session = _connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
MessageConsumer messageConsumer;
if (_selector != null)
{
messageConsumer = session.createConsumer(queue, _selector);
} else
{
messageConsumer = session.createConsumer(queue);
}
messageConsumer.setMessageListener(this);
_connection.start();
log.info("Completed startup.");
} catch (Exception exp)
{
// Caught exception, try a restart
log.error("Caught exception in startup", exp);
restart("Caught exception, restarting.", exp);
}
}
public void onStop()
{
// Cleanup stuff (stop threads, close sockets, etc.) to stop receiving data
log.info("Stopping...");
try
{
_connection.close();
} catch (JMSException exp)
{
log.error("Caught exception stopping", exp);
}
log.info("Stopped.");
}
private class JMSReceiverExceptionListener implements ExceptionListener
{
@Override
public void onException(JMSException exp)
{
log.error("Connection ExceptionListener fired, attempting restart.", exp);
restart("Connection ExceptionListener fired, attempting restart.");
}
}
@Override
public String toString()
{
return "JMSReceiver{" +
"brokerURL='" + _brokerURL + '\'' +
", username='" + _username + '\'' +
", password='" + _password + '\'' +
", queueName='" + _queueName + '\'' +
", selector='" + _selector + '\'' +
'}';
}
}
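The receiver above stores JMSEvent objects, but that class is not shown in the source. A minimal sketch follows; only the name JMSEvent comes from the example, while the fields and conversion logic are assumptions. It keeps just the serializable parts of the javax.jms.Message, since the wrapped objects are stored by Spark and may be shipped across the cluster:
import javax.jms.{Message, TextMessage}
// Minimal sketch of the JMSEvent wrapper referenced above (assumption: only the
// message id and body are kept). Case classes are Serializable, which matters
// because Spark stores these objects and may ship them between nodes.
case class JMSEvent(messageId: String, body: String) {
  // Auxiliary constructor so the Java receiver's `new JMSEvent(message)` call works.
  def this(message: Message) =
    this(message.getJMSMessageID, message match {
      case text: TextMessage => text.getText
      case other             => other.toString // fallback for non-text messages
    })
}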
Your JMSInputDStream would look like this:
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming._
import org.apache.spark.streaming.dstream._
import org.apache.spark.streaming.receiver.Receiver
private[streaming]
class JMSInputDStream(
@transient ssc_ : StreamingContext,
brokerURL: String,
username: String,
password: String,
queuename: String,
selector: String,
storageLevel: StorageLevel
) extends ReceiverInputDStream[JMSEvent](ssc_) {
override def getReceiver(): Receiver[JMSEvent] = {
new JMSReceiver(brokerURL, username, password, queuename, selector, storageLevel)
}
}
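To connect this back to the question's five parameters, here is a hedged spark-shell sketch. It assumes JMSReceiver and JMSEvent above have been compiled into a jar passed via --jars (or pasted with :paste); the broker URL, credentials, and queue name are placeholders, and note that the connection factory is resolved via JNDI inside the receiver's onStart rather than passed as a constructor argument:
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
// Placeholders for the connection details from the question (assumptions).
val brokerURL = "tcp://xyz:61616"
val queueName = "jms.xyz"
val username  = "user"
val password  = "secret"
// `sc` is the SparkContext that spark-shell provides.
val ssc = new StreamingContext(sc, Seconds(10))
// Plug the custom receiver into a DStream; the selector is optional, so pass null.
val events = ssc.receiverStream(
  new JMSReceiver(brokerURL, username, password, queueName, null,
                  StorageLevel.MEMORY_AND_DISK_SER_2))
events.map(_.toString).print()
ssc.start()
ssc.awaitTermination()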
Example 2 uses ActiveMQ with JmsTopicReceiver.scala:
import org.apache.spark.Logging
import org.apache.spark.storage.StorageLevel
import javax.{jms => jms}
/** Simple class of a receiver that can be run on worker nodes to receive the data from JMS Topic.
*
* In JMS a Topic implements publish and subscribe semantics.
* When you publish a message it goes to all the subscribers who are interested - so zero to many subscribers will receive a copy of the message.
* Only subscribers who had an active subscription at the time the broker receives the message will get a copy of the message.
*
* {{{
* val sc: SparkContext = SparkContext.getOrCreate(conf)
* val ssc: StreamingContext = new StreamingContext(sc, Seconds(...))
*
* val stream: InputDStream[String] = ssc.receiverStream(new JmsTopicReceiver(
* topicName = "testTopic",
* transformer = { msg => msg.asInstanceOf[javax.jms.TextMessage].getText() },
* connectionProvider = { () => {
* val cf = new org.apache.activemq.ActiveMQConnectionFactory("tcp://localhost:61616")
* cf.setOptimizeAcknowledge(true)
* cf.createConnection("username", "password")
* }}
* ))
*
* ...
*
* ssc.start()
* ssc.awaitTermination()
* }}}
*
* @param connectionProvider provides <CODE>javax.jms.Connection</CODE> for the receiver.
* @param transformer (pre)transforms <CODE>javax.jms.Message</CODE> into the appropriate class (this must be done before the result is stored).
* @param topicName the name of required <CODE>javax.jms.Topic</CODE>.
* @param messageSelector only messages with properties matching the message selector expression are delivered.
* @param storageLevel flags for controlling the storage of an RDD.
* @tparam T RDD element type.
*/
class JmsTopicReceiver[T] (
connectionProvider: (() => jms.Connection),
transformer: (jms.Message => T),
topicName: String,
messageSelector: Option[String] = None,
storageLevel: StorageLevel = StorageLevel.MEMORY_AND_DISK_SER_2
) extends AbstractJmsReceiver[T](
messageSelector = messageSelector,
storageLevel = storageLevel
) with Logging {
override protected def buildConnection(): jms.Connection = connectionProvider()
override protected def transform(message: jms.Message): T = transformer(message)
override protected def buildDestination(session: jms.Session): jms.Destination = session.createTopic(topicName)
}
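JmsTopicReceiver extends an AbstractJmsReceiver that is not shown here; it comes from the project this example was taken from. A rough sketch of what such a base class has to do follows, under the assumption that it works like Example 1 (build a connection, create a consumer on the destination, and store each transformed message); the internals are assumptions, only the class name and the three abstract methods come from the example above:
import javax.{jms => jms}
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver
// Hedged sketch of a base class like AbstractJmsReceiver (structure is an assumption).
abstract class AbstractJmsReceiver[T](
    messageSelector: Option[String],
    storageLevel: StorageLevel
) extends Receiver[T](storageLevel) {
  // Subclasses decide how to connect, which destination to consume from,
  // and how to turn a jms.Message into the DStream element type T.
  protected def buildConnection(): jms.Connection
  protected def buildDestination(session: jms.Session): jms.Destination
  protected def transform(message: jms.Message): T
  private var connection: jms.Connection = _
  override def onStart(): Unit = {
    connection = buildConnection()
    val session = connection.createSession(false, jms.Session.AUTO_ACKNOWLEDGE)
    val destination = buildDestination(session)
    val consumer = messageSelector match {
      case Some(selector) => session.createConsumer(destination, selector)
      case None           => session.createConsumer(destination)
    }
    // Push every transformed message into Spark's block manager asynchronously.
    consumer.setMessageListener(new jms.MessageListener {
      override def onMessage(message: jms.Message): Unit = store(transform(message))
    })
    connection.start()
  }
  override def onStop(): Unit = {
    if (connection != null) connection.close()
  }
}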
Example 3: Solace, using a custom Spark receiver. I worked with this a long time back, when Spark 1.3 was current:
Solace-JMS-Integration-Spark-Streaming.pdf
Further reading: Processing Data from MQ with Spark Streaming: Part 1 - Introduction to Messaging, JMS & MQ