py4j.protocol.Py4JError: An error occurred while calling o66.createStream - Amazon Kinesis, pyspark

Time: 2019-08-16 11:47:16

Tags: python pyspark amazon-emr amazon-kinesis py4j

I am trying to read data from an Amazon Kinesis Data Stream with Spark Streaming in Python, using KinesisUtils, but the call to createStream fails with an error.

from pyspark import SparkContext, StorageLevel
from pyspark.streaming import StreamingContext
from pyspark.streaming.kinesis import KinesisUtils, InitialPositionInStream

sc = SparkContext.getOrCreate()
ssc = StreamingContext(sc, 1)  # 1-second batch interval

# APPNAME, STREAMNAME, REGIONNAME, ENDPOINTURL and CHECKPOINTINTERVAL are constants defined here

# Positional order: ssc, app name, stream name, endpoint URL, region,
# initial position, checkpoint interval, storage level
kinesisStream = KinesisUtils.createStream(
    ssc, APPNAME, STREAMNAME, ENDPOINTURL,
    REGIONNAME, InitialPositionInStream.TRIM_HORIZON, CHECKPOINTINTERVAL, StorageLevel.MEMORY_AND_DISK_2)

kinesisStream.pprint()

ssc.start()
ssc.awaitTermination()

I run this on EMR with the following command: spark-submit --deploy-mode cluster --jars s3://bucket/spark-streaming-kinesis-asl_2.11-2.4.3.jar s3PathToMainPyFile
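For comparison, the submit command can be written as an argument list, which makes it easier to try variants side by side. This is only a sketch under stated assumptions: `build_submit_command` is a hypothetical helper, and the `--packages` variant with the Maven coordinate `org.apache.spark:spark-streaming-kinesis-asl_2.11:2.4.3` is an untested alternative that would let Spark resolve the connector together with its transitive dependencies instead of shipping a single jar. Nothing in the trace confirms that missing dependencies are the cause of the error.

```python
import shlex


def build_submit_command(main_py, jars=None, packages=None, deploy_mode="cluster"):
    """Return the spark-submit argument list for the given options (hypothetical helper)."""
    cmd = ["spark-submit", "--deploy-mode", deploy_mode]
    if jars:
        # --jars takes a comma-separated list of jar locations
        cmd += ["--jars", ",".join(jars)]
    if packages:
        # --packages takes comma-separated Maven coordinates (group:artifact:version)
        cmd += ["--packages", ",".join(packages)]
    cmd.append(main_py)
    return cmd


# The command as posted in the question: a single connector jar on S3.
as_posted = build_submit_command(
    "s3PathToMainPyFile",
    jars=["s3://bucket/spark-streaming-kinesis-asl_2.11-2.4.3.jar"],
)

# Assumption, untested: resolve the connector by Maven coordinate instead.
with_packages = build_submit_command(
    "s3PathToMainPyFile",
    packages=["org.apache.spark:spark-streaming-kinesis-asl_2.11:2.4.3"],
)

print(" ".join(shlex.quote(arg) for arg in as_posted))
print(" ".join(shlex.quote(arg) for arg in with_packages))
```

The first printed line reproduces the command from the question; the second differs only in replacing `--jars` with `--packages`.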

It fails with the following error:

ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/mnt/yarn/usercache/hadoop/appcache/application_1565898995408_0003/container_1565898995408_0003_01_00001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command
    response = connection.send_command(command)
  File "/mnt/yarn/usercache/hadoop/appcache/application_1565898995408_0003/container_1565898995408_0003_01_00001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1164, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
Py4JNetworkError: Error while receiving
Traceback (most recent call last):
  File "spark_sentiment2.py", line 24, in <module>
    REGIONNAME, InitialPositionInStream.TRIM_HORIZON, CHECKPOINTINTERVAL, StorageLevel.MEMORY_AND_DISK_2)
  File "/mnt/yarn/usercache/hadoop/appcache/application_1565898995408_0003/container_1565898995408_0003_01_00001/pyspark.zip/pyspark/streaming/kinesis.py", line 92, in createStream
  File "/mnt/yarn/usercache/hadoop/appcache/application_1565898995408_0003/container_1565898995408_0003_01_00001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/mnt/yarn/usercache/hadoop/appcache/application_1565898995408_0003/container_1565898995408_0003_01_00001/py4j-0.10.7-src.zip/py4j/protocol.py", line 336, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling o66.createStream

0 answers:

No answers yet