Spark - 如果设置了主参数,则无法访问S3文件

时间:2016-08-01 21:16:59

标签: python apache-spark pyspark rdd

当我启动指定--master参数的作业时,我在访问AWS S3上的文件时遇到问题 - 作为终端中的参数

spark-submit --master spark://myIP:port

或实际代码

conf = SparkConf().setAppName("test").setMaster("spark://myIP:port")

如果我不添加该参数,它的工作正常,......我在那里错过了什么?

BTW我正在使用pyspark。

提前致谢!

请参阅下面的错误日志:

16/08/01 21:22:57 ERROR TaskSetManager: Task 0 in stage 0.0 failed 4 times; aborting job
Traceback (most recent call last):
  File "/home/user/s3.py", line 14, in <module>
    print(rdd.take(1))
  File "/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 1310, in take
  File "/spark/python/lib/pyspark.zip/pyspark/context.py", line 941, in runJob
  File "/spark/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py", line 933, in __call__
  File "/spark/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py", line 312, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 10.0.128.1): org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.ServiceException: Request Error: java.security.ProviderException: java.security.KeyException
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:478)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:427)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:427)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:427)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:181)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at org.apache.hadoop.fs.s3native.$Proxy8.retrieveMetadata(Unknown Source)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:468)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.open(NativeS3FileSystem.java:611)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
    at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:108)
    at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:246)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:209)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:102)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
    at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
    at org.apache.spark.scheduler.Task.run(Task.scala:85)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.jets3t.service.ServiceException: Request Error: java.security.ProviderException: java.security.KeyException
    at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:623)
    at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:277)
    at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestHead(RestStorageService.java:1038)
    at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2250)
    at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectDetailsImpl(RestStorageService.java:2179)
    at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:1120)
    at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:575)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:174)
    ... 29 more
Caused by: javax.net.ssl.SSLException: java.security.ProviderException: java.security.KeyException
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
    at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1916)
    at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1874)
    at sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1857)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1378)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1355)
    at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:553)
    at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:412)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:179)
    at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:144)
    at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:134)
    at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:612)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:447)
    at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:884)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
    at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:326)
    ... 36 more
Caused by: java.security.ProviderException: java.security.KeyException
    at sun.security.ec.ECKeyPairGenerator.generateKeyPair(ECKeyPairGenerator.java:146)
    at java.security.KeyPairGenerator$Delegate.generateKeyPair(KeyPairGenerator.java:704)
    at sun.security.ssl.ECDHCrypt.<init>(ECDHCrypt.java:78)
    at sun.security.ssl.ClientHandshaker.serverKeyExchange(ClientHandshaker.java:717)
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:278)
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:913)
    at sun.security.ssl.Handshaker.process_record(Handshaker.java:849)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1035)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1344)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1371)
    ... 48 more
Caused by: java.security.KeyException
    at sun.security.ec.ECKeyPairGenerator.generateECKeyPair(Native Method)
    at sun.security.ec.ECKeyPairGenerator.generateKeyPair(ECKeyPairGenerator.java:126)
    ... 57 more

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1450)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1438)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1437)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1437)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1659)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1618)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1607)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1871)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1884)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1897)
    at org.apache.spark.api.python.PythonRDD$.runJob(PythonRDD.scala:441)
    at org.apache.spark.api.python.PythonRDD.runJob(PythonRDD.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:280)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:128)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:211)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.ServiceException: Request Error: java.security.ProviderException: java.security.KeyException
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:478)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:427)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:427)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:427)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:181)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at org.apache.hadoop.fs.s3native.$Proxy8.retrieveMetadata(Unknown Source)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:468)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.open(NativeS3FileSystem.java:611)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
    at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:108)
    at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:246)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:209)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:102)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
    at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
    at org.apache.spark.scheduler.Task.run(Task.scala:85)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    ... 1 more
Caused by: org.jets3t.service.ServiceException: Request Error: java.security.ProviderException: java.security.KeyException
    at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:623)
    at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:277)
    at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestHead(RestStorageService.java:1038)
    at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2250)
    at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectDetailsImpl(RestStorageService.java:2179)
    at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:1120)
    at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:575)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:174)
    ... 29 more
Caused by: javax.net.ssl.SSLException: java.security.ProviderException: java.security.KeyException
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
    at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1916)
    at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1874)
    at sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1857)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1378)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1355)
    at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:553)
    at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:412)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:179)
    at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:144)
    at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:134)
    at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:612)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:447)
    at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:884)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
    at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:326)
    ... 36 more
Caused by: java.security.ProviderException: java.security.KeyException
    at sun.security.ec.ECKeyPairGenerator.generateKeyPair(ECKeyPairGenerator.java:146)
    at java.security.KeyPairGenerator$Delegate.generateKeyPair(KeyPairGenerator.java:704)
    at sun.security.ssl.ECDHCrypt.<init>(ECDHCrypt.java:78)
    at sun.security.ssl.ClientHandshaker.serverKeyExchange(ClientHandshaker.java:717)
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:278)
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:913)
    at sun.security.ssl.Handshaker.process_record(Handshaker.java:849)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1035)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1344)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1371)
    ... 48 more
Caused by: java.security.KeyException
    at sun.security.ec.ECKeyPairGenerator.generateECKeyPair(Native Method)
    at sun.security.ec.ECKeyPairGenerator.generateKeyPair(ECKeyPairGenerator.java:126)
    ... 57 more

1 个答案:

答案 0 :(得分:0)

按照文档的要求安装java版本1.8,问题就解决了。