I can run the Java version of the SparkPi example successfully, as shown below.
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-client \
--num-executors 3 \
--driver-memory 4g \
--executor-memory 2g \
--executor-cores 1 \
--queue thequeue \
lib/spark-examples*.jar \
10
However, the Python version fails with the error below. I am using yarn-client mode; launching the pyspark shell in yarn-client mode prints the same messages. Can anyone help me figure this out?
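(For reference, the pyspark equivalent would be something like the following; the exact flags are an assumption.)
./bin/pyspark --master yarn-client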
nlp@yyy2:~/spark$ ./bin/spark-submit --master yarn-client examples/src/main/python/pi.py
15/01/05 17:22:26 INFO spark.SecurityManager: Changing view acls to: nlp
15/01/05 17:22:26 INFO spark.SecurityManager: Changing modify acls to: nlp
15/01/05 17:22:26 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nlp); users with modify permissions: Set(nlp)
15/01/05 17:22:26 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/01/05 17:22:26 INFO Remoting: Starting remoting
15/01/05 17:22:26 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@yyy2:42747]
15/01/05 17:22:26 INFO util.Utils: Successfully started service 'sparkDriver' on port 42747.
15/01/05 17:22:26 INFO spark.SparkEnv: Registering MapOutputTracker
15/01/05 17:22:26 INFO spark.SparkEnv: Registering BlockManagerMaster
15/01/05 17:22:26 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20150105172226-aeae
15/01/05 17:22:26 INFO storage.MemoryStore: MemoryStore started with capacity 265.1 MB
15/01/05 17:22:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/01/05 17:22:27 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-cbe0079b-79c5-426b-b67e-548805423b11
15/01/05 17:22:27 INFO spark.HttpServer: Starting HTTP Server
15/01/05 17:22:27 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/01/05 17:22:27 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:57169
15/01/05 17:22:27 INFO util.Utils: Successfully started service 'HTTP file server' on port 57169.
15/01/05 17:22:27 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/01/05 17:22:27 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/01/05 17:22:27 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
15/01/05 17:22:27 INFO ui.SparkUI: Started SparkUI at http://yyy2:4040
15/01/05 17:22:27 INFO client.RMProxy: Connecting to ResourceManager at yyy14/10.112.168.195:8032
15/01/05 17:22:27 INFO yarn.Client: Requesting a new application from cluster with 6 NodeManagers
15/01/05 17:22:27 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
15/01/05 17:22:27 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/01/05 17:22:27 INFO yarn.Client: Setting up container launch context for our AM
15/01/05 17:22:27 INFO yarn.Client: Preparing resources for our AM container
15/01/05 17:22:28 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 24 for xxx on ha-hdfs:hzdm-cluster1
15/01/05 17:22:28 INFO yarn.Client: Uploading resource file:/home/nlp/platform/spark-1.2.0-bin-2.5.2/lib/spark-assembly-1.2.0-hadoop2.5.2.jar -> hdfs://hzdm-cluster1/user/nlp/.sparkStaging/application_1420444011562_0023/spark-assembly-1.2.0-hadoop2.5.2.jar
15/01/05 17:22:29 INFO yarn.Client: Uploading resource file:/home/nlp/platform/spark-1.2.0-bin-2.5.2/examples/src/main/python/pi.py -> hdfs://hzdm-cluster1/user/nlp/.sparkStaging/application_1420444011562_0023/pi.py
15/01/05 17:22:29 INFO yarn.Client: Setting up the launch environment for our AM container
15/01/05 17:22:29 INFO spark.SecurityManager: Changing view acls to: nlp
15/01/05 17:22:29 INFO spark.SecurityManager: Changing modify acls to: nlp
15/01/05 17:22:29 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nlp); users with modify permissions: Set(nlp)
15/01/05 17:22:29 INFO yarn.Client: Submitting application 23 to ResourceManager
15/01/05 17:22:30 INFO impl.YarnClientImpl: Submitted application application_1420444011562_0023
15/01/05 17:22:31 INFO yarn.Client: Application report for application_1420444011562_0023 (state: ACCEPTED)
15/01/05 17:22:31 INFO yarn.Client:
client token: Token { kind: YARN_CLIENT_TOKEN, service: }
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.default
start time: 1420449749969
final status: UNDEFINED
tracking URL: http://yyy14:8070/proxy/application_1420444011562_0023/
user: nlp
15/01/05 17:22:32 INFO yarn.Client: Application report for application_1420444011562_0023 (state: ACCEPTED)
15/01/05 17:22:33 INFO yarn.Client: Application report for application_1420444011562_0023 (state: ACCEPTED)
15/01/05 17:22:34 INFO yarn.Client: Application report for application_1420444011562_0023 (state: ACCEPTED)
15/01/05 17:22:35 INFO yarn.Client: Application report for application_1420444011562_0023 (state: ACCEPTED)
15/01/05 17:22:36 INFO yarn.Client: Application report for application_1420444011562_0023 (state: ACCEPTED)
15/01/05 17:22:36 INFO cluster.YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://sparkYarnAM@yyy16:52855/user/YarnAM#435880073]
15/01/05 17:22:36 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> yyy14, PROXY_URI_BASES -> http://yyy14:8070/proxy/application_1420444011562_0023), /proxy/application_1420444011562_0023
15/01/05 17:22:36 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/01/05 17:22:37 INFO yarn.Client: Application report for application_1420444011562_0023 (state: RUNNING)
15/01/05 17:22:37 INFO yarn.Client:
client token: Token { kind: YARN_CLIENT_TOKEN, service: }
diagnostics: N/A
ApplicationMaster host: yyy16
ApplicationMaster RPC port: 0
queue: root.default
start time: 1420449749969
final status: UNDEFINED
tracking URL: http://yyy14:8070/proxy/application_1420444011562_0023/
user: nlp
15/01/05 17:22:37 INFO cluster.YarnClientSchedulerBackend: Application application_1420444011562_0023 has started running.
15/01/05 17:22:37 INFO netty.NettyBlockTransferService: Server created on 35648
15/01/05 17:22:37 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/01/05 17:22:37 INFO storage.BlockManagerMasterActor: Registering block manager yyy2:35648 with 265.1 MB RAM, BlockManagerId(<driver>, yyy2, 35648)
15/01/05 17:22:37 INFO storage.BlockManagerMaster: Registered BlockManager
15/01/05 17:22:37 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM@yyy16:52855] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
15/01/05 17:22:38 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/kill,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/static,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump/json,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/threadDump,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors/json,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/executors,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment/json,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/environment,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd/json,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/rdd,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage/json,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/storage,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool/json,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/pool,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage/json,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/stage,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages/json,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/stages,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job/json,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/job,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs/json,null}
15/01/05 17:22:38 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs,null}
15/01/05 17:22:38 INFO ui.SparkUI: Stopped Spark web UI at http://yyy2:4040
15/01/05 17:22:38 INFO scheduler.DAGScheduler: Stopping DAGScheduler
15/01/05 17:22:38 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
15/01/05 17:22:38 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
15/01/05 17:22:38 INFO cluster.YarnClientSchedulerBackend: Stopped
15/01/05 17:22:39 INFO spark.MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
15/01/05 17:22:39 INFO storage.MemoryStore: MemoryStore cleared
15/01/05 17:22:39 INFO storage.BlockManager: BlockManager stopped
15/01/05 17:22:39 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
15/01/05 17:22:39 INFO spark.SparkContext: Successfully stopped SparkContext
15/01/05 17:22:39 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/01/05 17:22:39 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/01/05 17:22:39 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/01/05 17:22:57 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
Traceback (most recent call last):
File "/home/nlp/platform/spark-1.2.0-bin-2.5.2/examples/src/main/python/pi.py", line 29, in <module>
sc = SparkContext(appName="PythonPi")
File "/home/nlp/spark/python/pyspark/context.py", line 105, in __init__
conf, jsc)
File "/home/nlp/spark/python/pyspark/context.py", line 153, in _do_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)
File "/home/nlp/spark/python/pyspark/context.py", line 201, in _initialize_context
return self._jvm.JavaSparkContext(jconf)
File "/home/nlp/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 701, in __call__
File "/home/nlp/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NullPointerException
at org.apache.spark.SparkContext.<init>(SparkContext.scala:497)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
Answer 0 (score: 6)
If you are running this example on Java 8, it may be caused by Java 8 allocating far more virtual memory than requested, which trips YARN's container memory checks: https://issues.apache.org/jira/browse/YARN-4714
You can force YARN to ignore this by setting the following properties in yarn-site.xml:
<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
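For the change to take effect, the NodeManagers have to re-read yarn-site.xml, which normally means restarting them. A minimal sketch with the stock Hadoop 2.x scripts, run on each NodeManager host (adapt if the cluster is managed through Ambari or a similar tool):
# restart the NodeManager so it picks up the new yarn-site.xml
$HADOOP_HOME/sbin/yarn-daemon.sh stop nodemanager
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager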
Answer 1 (score: 2)
Try the deploy-mode parameter, as shown below:
--deploy-mode cluster
I had a problem like yours, and this parameter fixed it.
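Applied to the command from the question, a sketch of the full invocation would be (on newer Spark the same thing is spelled --master yarn --deploy-mode cluster):
./bin/spark-submit --master yarn-cluster examples/src/main/python/pi.py
Note, though, that cluster deploy mode for Python applications was only added around Spark 1.4, so on the 1.2.0 build from the question this combination may itself be rejected.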
Answer 2 (score: 1)
I ran into a similar problem using spark-submit with yarn-client (I got the same NPE / stack trace). Dialing down my memory settings did the trick; it seems to fail when you try to allocate too much memory. I would start by removing the --executor-memory and --driver-memory switches.
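For example, the Java command from the question with the two memory switches removed (so Spark falls back to its defaults) would be:
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-client \
--num-executors 3 \
--executor-cores 1 \
--queue thequeue \
lib/spark-examples*.jar \
10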
Answer 3 (score: 1)
I reduced the number of cores in the advanced spark-env configuration, and that got it working.
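Assuming "advanced spark-env" refers to an Ambari-style spark-env.sh section, a sketch of that change (the variable names are the standard Spark 1.x YARN ones; the values are only examples):
# in spark-env.sh
export SPARK_EXECUTOR_CORES=1      # fewer cores per executor
export SPARK_EXECUTOR_INSTANCES=2  # optionally fewer executors as well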
Answer 4 (score: 0)
I ran into this problem too (HDP 2.3, Spark 1.3.1).
My solution was to set a Spark configuration value.
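As a sketch of how such a value can be supplied (the key below is a placeholder, not necessarily the one this answer used):
./bin/spark-submit --master yarn-client \
--conf spark.some.setting=some-value \
examples/src/main/python/pi.py
# or equivalently, one line per setting in conf/spark-defaults.conf:
# spark.some.setting   some-value
On HDP specifically, a setting that frequently has to be supplied is -Dhdp.version=<your HDP build> via spark.driver.extraJavaOptions and spark.yarn.am.extraJavaOptions, though whether that is what this answer meant is only a guess.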