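I am running an Oozie workflow on a Cloudera cluster (Cloudera Express 5.15.1). One of the actions in my workflow is a Spark job written in Python. The workflow ran smoothly for a long time, and then a few days ago the Spark job started crashing with this stack trace: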
18/11/02 06:00:30 INFO ApplicationMaster: Waiting for spark context initialization ...
18/11/02 06:00:30 ERROR ApplicationMaster: User class threw exception: java.lang.NoClassDefFoundError: py4j/GatewayServer
java.lang.NoClassDefFoundError: py4j/GatewayServer
    at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:49)
    at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:543)
Caused by: java.lang.ClassNotFoundException: py4j.GatewayServer
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 7 more
How can I diagnose and fix this problem?
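
Since the trace says the JVM cannot load py4j.GatewayServer, one first check might be to see which jars, if any, actually contain that class. Below is a minimal sketch of such a check, not a fix. It assumes the jars the Spark action uses have been copied to a local directory; which directory that is depends on the cluster setup and is not known here. The function name jars_containing is hypothetical, chosen only for illustration.

```python
import os
import sys
import zipfile

# Path of the missing class inside a jar, derived from the stack trace.
CLASS_ENTRY = "py4j/GatewayServer.class"


def jars_containing(root, entry=CLASS_ENTRY):
    """Return the sorted paths of .jar files under root that contain entry."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".jar"):
                continue
            path = os.path.join(dirpath, name)
            try:
                with zipfile.ZipFile(path) as jar:
                    if entry in jar.namelist():
                        hits.append(path)
            except (zipfile.BadZipFile, OSError):
                # Skip unreadable or corrupt archives instead of aborting the scan.
                continue
    return sorted(hits)


if __name__ == "__main__":
    # The directory to scan is an assumption; pass it as the first argument.
    scan_root = sys.argv[1] if len(sys.argv) > 1 else "."
    for jar_path in jars_containing(scan_root):
        print(jar_path)
```

If the scan prints nothing for the directory the job draws its jars from, that would be consistent with the class simply being absent from the classpath, though it would not by itself explain why that changed a few days ago.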