pysparkling

Time: 2017-04-16 18:02:13

Tags: h2o sparkling-water

I am new to pysparkling. I am using a YARN cluster with Spark 1.6, Cloudera CDH 5.8.0, and Python 2.7.6, and I have a problem with hc = H2OContext.getOrCreate(sc). Do you have any ideas?

from pysparkling import *
import h2o
hc = H2OContext.getOrCreate(sc)

17/04/16 17:13:59 INFO spark.SparkContext: Added JAR /root/.cache/Python-Eggs/h2o_pysparkling_1.6-1.6.10-py2.7.egg-tmp/sparkling_water/sparkling_water_assembly.jar at spark://147.232.202.114:47251/jars/sparkling_water_assembly.jar with timestamp 1492355639066
17/04/16 17:13:59 WARN internal.InternalH2OBackend: Increasing 'spark.locality.wait' to value 30000
17/04/16 17:13:59 WARN internal.InternalH2OBackend: Due to non-deterministic behavior of Spark broadcast-based joins We recommend to disable them by configuring spark.sql.autoBroadcastJoinThreshold variable to value -1: sqlContext.sql("SET spark.sql.autoBroadcastJoinThreshold=-1")
17/04/16 17:13:59 WARN internal.InternalH2OBackend: The property 'spark.scheduler.minRegisteredResourcesRatio' is not specified! We recommend to pass --conf spark.scheduler.minRegisteredResourcesRatio=1 
17/04/16 17:13:59 WARN internal.InternalH2OBackend: Unsupported options spark.dynamicAllocation.enabled detected! 
17/04/16 17:13:59 WARN internal.InternalH2OBackend: The application is going down, since the parameter (spark.ext.h2o.fail.on.unsupported.spark.param,true) is true! If you would like to skip the fail call, please, specify the value of the parameter to false.

Traceback (most recent call last):
  File "&lt;stdin&gt;", line 1, in &lt;module&gt;
  File "build/bdist.linux-x86_64/egg/pysparkling/context.py", line 128, in getOrCreate
  File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
  File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/pyspark/sql/utils.py", line 45, in deco
    return f(*a, **kw)
  File "/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o54.invoke.
: java.lang.IllegalArgumentException: Unsupported argument: (spark.dynamicAllocation.enabled,true)
    at org.apache.spark.h2o.backends.internal.InternalBackendUtils$$anonfun$checkUnsupportedSparkOptions$1.apply(InternalBackendUtils.scala:48)
    at org.apache.spark.h2o.backends.internal.InternalBackendUtils$$anonfun$checkUnsupportedSparkOptions$1.apply(InternalBackendUtils.scala:40)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.h2o.backends.internal.InternalBackendUtils$class.checkUnsupportedSparkOptions(InternalBackendUtils.scala:40)
    at org.apache.spark.h2o.backends.internal.InternalH2OBackend.checkUnsupportedSparkOptions(InternalH2OBackend.scala:31)
    at org.apache.spark.h2o.backends.internal.InternalH2OBackend.checkAndUpdateConf(InternalH2OBackend.scala:61)
    at org.apache.spark.h2o.H2OContext.&lt;init&gt;(H2OContext.scala:96)
    at org.apache.spark.h2o.H2OContext$.getOrCreate(H2OContext.scala:294)
    at org.apache.spark.h2o.H2OContext.getOrCreate(H2OContext.scala)
    at org.apache.spark.h2o.JavaH2OContext.getOrCreate(JavaH2OContext.java:191)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:259)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:745)

1 answer:

Answer 0: (score: 0)

This can be solved by running the program with the following command, which has been tested with Spark 1.6 and H2O version > 3.0:

bin/pysparkling h2o_spark.py --conf spark.ext.h2o.fail.on.unsupported.spark.param=false
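
The traceback shows the root cause: Sparkling Water's internal backend does not support Spark dynamic allocation, and spark.ext.h2o.fail.on.unsupported.spark.param=true (the default) turns that check into a hard failure. Instead of suppressing the check as above, you can address the warnings directly. A sketch of the relevant spark-defaults.conf (or equivalent --conf flags) entries, assembled from the property names in the log output; whether minRegisteredResourcesRatio=1 is appropriate depends on your cluster:

```
# Disable dynamic allocation so the internal H2O backend runs on a fixed set of executors
spark.dynamicAllocation.enabled               false
# Wait until all executors are registered before starting (recommended in the backend warning)
spark.scheduler.minRegisteredResourcesRatio   1
# Disable broadcast-based joins, which the backend warns are non-deterministic with H2O
spark.sql.autoBroadcastJoinThreshold          -1
```

With dynamic allocation disabled, the IllegalArgumentException no longer triggers and there is no need to set spark.ext.h2o.fail.on.unsupported.spark.param=false at all.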