Setting the SparkSession master to a specific IP in PySpark

Date: 2018-08-29 12:55:27

Tags: python apache-spark pyspark

I am trying to point my Spark session at my local IP like this:

spark = SparkSession.\
    builder.\
    master("spark://192.168.2.310:7077").\
    appName("new_h").\
    config("spark.executor.heartbeatInterval","60s").\
    config("spark.executor.cores","1").\
    config("spark.cores.max","2").\
    config("spark.driver.memory", "4g").\
    getOrCreate()

But it gives me the following error:

> Py4JJavaError                             Traceback (most recent call last)
<ipython-input-4-9216c0a86a45> in <module>()
      6 import pandas as pd
      7 import numpy as np
----> 8 spark = SparkSession.        builder.        master("spark://192.168.5.220:7077").        appName("new_h").        config("spark.executor.heartbeatInterval","60s").        config("spark.executor.cores","1").        config("spark.cores.max","2").        config("spark.driver.memory", "4g").        getOrCreate()
      9 spark.stop()

~\Anaconda3\lib\site-packages\pyspark\sql\session.py in getOrCreate(self)
    171                     for key, value in self._options.items():
    172                         sparkConf.set(key, value)
--> 173                     sc = SparkContext.getOrCreate(sparkConf)
    174                     # This SparkContext may be an existing one.
    175                     for key, value in self._options.items():

~\Anaconda3\lib\site-packages\pyspark\context.py in getOrCreate(cls, conf)
    341         with SparkContext._lock:
    342             if SparkContext._active_spark_context is None:
--> 343                 SparkContext(conf=conf or SparkConf())
    344             return SparkContext._active_spark_context
    345 

~\Anaconda3\lib\site-packages\pyspark\context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    116         try:
    117             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
--> 118                           conf, jsc, profiler_cls)
    119         except:
    120             # If an error occurs, clean up in order to allow future SparkContext creation:

~\Anaconda3\lib\site-packages\pyspark\context.py in _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, jsc, profiler_cls)
    178 
    179         # Create the Java SparkContext through Py4J
--> 180         self._jsc = jsc or self._initialize_context(self._conf._jconf)
    181         # Reset the SparkConf to the one actually used by the SparkContext in JVM.
    182         self._conf = SparkConf(_jconf=self._jsc.sc().conf())

~\Anaconda3\lib\site-packages\pyspark\context.py in _initialize_context(self, jconf)
    280         Initialize SparkContext in function to allow subclass specific initialization
    281         """
--> 282         return self._jvm.JavaSparkContext(jconf)
    283 
    284     @classmethod

~\Anaconda3\lib\site-packages\py4j\java_gateway.py in __call__(self, *args)
   1523         answer = self._gateway_client.send_command(command)
   1524         return_value = get_return_value(
-> 1525             answer, self._gateway_client, None, self._fqn)
   1526 
   1527         for temp_arg in temp_args:

~\Anaconda3\lib\site-packages\py4j\protocol.py in get_return_value(answer, gateway_client, target_id, name)
    326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.NullPointerException
    at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:64)
    at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:241)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:238)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)
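The stack trace fails inside BlockManagerMaster.registerBlockManager, which I assume means the driver never actually managed to register with the master. A quick way to rule out a plain connectivity problem is a socket test from the driver machine; a minimal sketch (the address below is the one from the traceback and is only a placeholder):

import socket

# Hypothetical reachability check: can this machine open a TCP connection
# to the standalone master's port at all? Adjust host/port to your setup.
master_host, master_port = "192.168.5.220", 7077

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(5)
try:
    sock.connect((master_host, master_port))
    print("master port is reachable")
except OSError as exc:
    print("cannot reach master:", exc)
finally:
    sock.close()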

I know that I can connect with:

spark = SparkSession.\
    builder.\
    master("local[*]").\
    appName("new_h").\
    config("spark.executor.heartbeatInterval","60s").\
    config("spark.executor.cores","1").\
    config("spark.cores.max","2").\
    config("spark.driver.memory", "4g").\
    getOrCreate()

But I want to connect via the IP. Can someone tell me what my mistake is?
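For reference, this is roughly the shape I expect the IP-based call to take, assuming the spark:// URL has to match what the master's web UI reports and that spark.driver.host has to point at my own machine's address so executors can reach back to the driver (both IPs below are placeholders, not my real addresses):

from pyspark.sql import SparkSession

master_url = "spark://192.168.5.220:7077"   # placeholder: URL as shown on the master web UI
driver_ip = "192.168.2.105"                 # placeholder: the driver machine's own IP

spark = SparkSession.\
    builder.\
    master(master_url).\
    appName("new_h").\
    config("spark.driver.host", driver_ip).\
    config("spark.executor.heartbeatInterval", "60s").\
    config("spark.executor.cores", "1").\
    config("spark.cores.max", "2").\
    config("spark.driver.memory", "4g").\
    getOrCreate()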

0 Answers:

There are no answers yet.