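I want to use Spark Streaming to read binlog data from Kafka. The binlog data is collected with canal (which uses protobuf-2.4.1), but I have to use protobuf-2.5.0 in the Spark Streaming environment. Now I get the following exception: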
16/07/11 15:13:01 ERROR yarn.ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 : java.lang.RuntimeException: Unable to find proto buffer class
at com.google.protobuf.GeneratedMessageLite$SerializedForm.readResolve(GeneratedMessageLite.java:775)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1104)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1807)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at com.data.binlog.BinlogEntryUtil.deserializeFromProtoBuf(BinlogEntryUtil.java:30)
at main.com.data.scala.Utils$.binlogDecode(Utils.scala:30)
at main.com.data.scala.IntegrateKafka$$anonfun$main$4.apply(IntegrateKafka.scala:37)
at main.com.data.scala.IntegrateKafka$$anonfun$main$4.apply(IntegrateKafka.scala:37)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at main.com.data.scala.IntegrateKafka$$anonfun$main$5$$anonfun$apply$2$$anonfun$apply$3.apply(IntegrateKafka.scala:42)
at main.com.data.scala.IntegrateKafka$$anonfun$main$5$$anonfun$apply$2$$anonfun$apply$3.apply(IntegrateKafka.scala:42)
at org.apache.spark.Logging$class.logInfo(Logging.scala:59)
at main.com.data.scala.IntegrateKafka$.logInfo(IntegrateKafka.scala:16)
at main.com.data.scala.IntegrateKafka$$anonfun$main$5$$anonfun$apply$2.apply(IntegrateKafka.scala:42)
at main.com.data.scala.IntegrateKafka$$anonfun$main$5$$anonfun$apply$2.apply(IntegrateKafka.scala:39)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:898)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:898)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1850)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1850)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.alibaba.otter.canal.protocol.CanalEntry$Entry
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:191)
at com.google.protobuf.GeneratedMessageLite$SerializedForm.readResolve(GeneratedMessageLite.java:768)
... 29 more
The code of com.data.binlog.BinlogEntryUtil.deserializeFromProtoBuf is here:
public static Entry deserializeFromProtoBuf(byte[] input) {
    Entry entry = null;
    // Reads the input as a Java serialization stream, not as raw protobuf bytes.
    try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(input))) {
        entry = (Entry) ois.readObject();
    } catch (ClassNotFoundException e) {
        logger.error("ClassNotFoundException", e);
    } catch (IOException e) {
        logger.error("IOException", e);
    }
    return entry;
}
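Since the method above goes through ObjectInputStream, it only works on a Java serialization stream. Here is a minimal sketch of a check on the first bytes of a payload, in case it helps to see what the Kafka messages really contain. The class name SerializedFormCheck and the sample byte arrays are made up for illustration; only the ObjectStreamConstants values come from the JDK.

```java
import java.io.ObjectStreamConstants;

public class SerializedFormCheck {

    /**
     * Returns true when the payload starts with the Java serialization
     * stream header: magic 0xACED followed by version 0x0005.
     */
    public static boolean looksLikeJavaSerialized(byte[] payload) {
        if (payload == null || payload.length < 4) {
            return false;
        }
        // Combine the first two byte pairs into unsigned 16-bit values.
        int magic = ((payload[0] & 0xFF) << 8) | (payload[1] & 0xFF);
        int version = ((payload[2] & 0xFF) << 8) | (payload[3] & 0xFF);
        return magic == (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF)
                && version == ObjectStreamConstants.STREAM_VERSION;
    }

    public static void main(String[] args) {
        // Illustrative sample bytes only, not real binlog data.
        byte[] javaStream = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05, 0x73};
        byte[] somethingElse = {0x0A, 0x03, 0x66, 0x6F, 0x6F};
        System.out.println(looksLikeJavaSerialized(javaStream));
        System.out.println(looksLikeJavaSerialized(somethingElse));
    }
}
```

If the check is true, the payload was written with ObjectOutputStream, and whichever class was serialized has to be loadable on the executor side. If it is false, ObjectInputStream is the wrong tool for these bytes.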
But I found CanalEntry$Entry.class in the jar:
-rw----     2.0 fat    14431 bl defN 16-Jul-11 15:11 com/alibaba/otter/canal/protocol/CanalEntry$Entry.class
I also tried using CanalEntry.java and CanalPacket.java generated with protoc-2.5.0, but got the same exception: java.lang.ClassNotFoundException: com.alibaba.otter.canal.protocol.CanalEntry$Entry
Can anybody give me some advice on how to read binlog data (serialized by protobuf-2.4.1) using protobuf-2.5.0?
Thanks
Answer 0 (score: 0)
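After many attempts, I found that this is not a protobuf problem. I ran into it because the binlog data are not pure protobuf bytes; they are the serialized bytes of another class that contains the protobuf bytes.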
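To make that layout concrete, here is a rough sketch using only the JDK. It assumes the outer object is an ordinary Serializable class holding a byte[] field; BinlogEnvelope, wrap and unwrap are hypothetical names, not the real producer-side class, which is not shown in this post.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

public class EnvelopeDemo {

    /** Hypothetical wrapper: a plain Serializable class carrying protobuf bytes. */
    public static class BinlogEnvelope implements Serializable {
        private static final long serialVersionUID = 1L;
        private final byte[] protobufBytes;

        public BinlogEnvelope(byte[] protobufBytes) {
            this.protobufBytes = protobufBytes.clone();
        }

        public byte[] getProtobufBytes() {
            return protobufBytes.clone();
        }
    }

    /** Producer side (assumed): Java-serialize the wrapper, not the protobuf message. */
    public static byte[] wrap(byte[] protobufBytes) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(new BinlogEnvelope(protobufBytes));
        } catch (IOException e) {
            throw new IllegalStateException("could not serialize envelope", e);
        }
        return buffer.toByteArray();
    }

    /** Consumer side: deserialize the wrapper first, then hand back the inner bytes. */
    public static byte[] unwrap(byte[] kafkaValue) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(kafkaValue))) {
            BinlogEnvelope envelope = (BinlogEnvelope) ois.readObject();
            return envelope.getProtobufBytes();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not deserialize envelope", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder bytes standing in for an encoded protobuf message.
        byte[] inner = {0x0A, 0x03, 0x66, 0x6F, 0x6F};
        byte[] recovered = unwrap(wrap(inner));
        System.out.println(Arrays.equals(inner, recovered));
    }
}
```

In a setup like this, the recovered bytes would then go to the generated parser, for example CanalEntry.Entry.parseFrom(bytes), assuming the inner payload really is an encoded Entry message. Casting the result of readObject() directly to Entry does not match that layout.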