Why won't Spark run in Eclipse?

Time: 2018-12-30 15:18:24

Tags: java python-3.x eclipse hadoop pyspark

I have installed pyspark 2.1 with hadoop2.6 in Eclipse (Eclipse plugin: PyDev), using Python 3.7, JRE 8, and JDK 1.8.

I am trying to run a simple piece of test code:

from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()

But I get the following error:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
18/12/30 17:04:33 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[main,5,main]
java.util.NoSuchElementException: key not found: _PYSPARK_DRIVER_CALLBACK_HOST
  at scala.collection.MapLike$class.default(MapLike.scala:228)
  at scala.collection.AbstractMap.default(Map.scala:59)
  at scala.collection.MapLike$class.apply(MapLike.scala:141)
  at scala.collection.AbstractMap.apply(Map.scala:59)
  at org.apache.spark.api.python.PythonGatewayServer$$anonfun$main$1.apply$mcV$sp(PythonGatewayServer.scala:50)
  at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1228)
  at org.apache.spark.api.python.PythonGatewayServer$.main(PythonGatewayServer.scala:37)
  at org.apache.spark.api.python.PythonGatewayServer.main(PythonGatewayServer.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
  at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
  at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Traceback (most recent call last):
  File "C:\Users\charfoush\eclipse-workspace\sample2\test2.py", line 7, in <module>
    spark = SparkSession.builder.getOrCreate()
  File "C:\Users\charfoush\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pyspark\sql\session.py", line 173, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "C:\Users\charfoush\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pyspark\context.py", line 351, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "C:\Users\charfoush\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pyspark\context.py", line 115, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "C:\Users\charfoush\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pyspark\context.py", line 300, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "C:\Users\charfoush\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pyspark\java_gateway.py", line 93, in launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number

2 answers:

Answer 0 (score: 0):

This issue can occur, for example:

  • if the versions do not match (e.g. the installed pyspark package and the Spark distribution it launches)
  • or if you have not defined the SPARK_HOME and PYTHONPATH environment variables correctly (make sure neither of them points at an older version); a sketch of both fixes follows this list
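
A minimal sketch of both fixes, assuming a Windows setup like the one in the question; the Spark install path and the py4j zip name below are hypothetical and must match your actual installation. Setting the variables at the top of the script, before any pyspark import, also makes the run independent of the Eclipse/PyDev launch configuration:

import os
import sys

# Hypothetical install location -- point this at your real Spark distribution.
os.environ["SPARK_HOME"] = r"C:\spark-2.1.0-bin-hadoop2.6"

# Run the Python workers with the same interpreter that Eclipse/PyDev uses.
os.environ["PYSPARK_PYTHON"] = sys.executable

# Put Spark's Python sources and its bundled py4j on sys.path (the same
# effect PYTHONPATH would have); the py4j version differs per Spark release.
sys.path.insert(0, os.path.join(os.environ["SPARK_HOME"], "python"))
sys.path.insert(0, os.path.join(os.environ["SPARK_HOME"], "python",
                                "lib", "py4j-0.10.4-src.zip"))

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
print(spark.version)  # starts cleanly only if the versions above line up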

Answer 1 (score: 0):

This worked for me: https://blog.puneethabm.com/pyspark-dev-set-up-eclipse-windows/
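
The linked guide walks through the PyDev setup; before following it, a quick sanity check for the version mismatch mentioned in Answer 0 can save time. This is a sketch that assumes spark-submit from your Spark distribution is on PATH; the two reported versions should match exactly:

import subprocess

import pyspark

print("pyspark package version:", pyspark.__version__)

# spark-submit prints its version banner to stderr; shell=True lets the
# Windows shell resolve spark-submit.cmd.
result = subprocess.run("spark-submit --version", shell=True,
                        capture_output=True, text=True)
print(result.stderr or result.stdout)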