I am trying to run a Spark job on an EKS cluster so that it spins up 5 workers to execute the job. Everything works except for accessing the executable jar in AWS S3 that the job is supposed to run. Here is my code.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: project
  namespace: spark
spec:
  schedule: "48 15 * * *" # min hour day month week; will be "00 04 * * *" maybe
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: spark-service-acc
          containers:
          - name: project
            image: 1234567890.dkr.ecr.us-east-1.amazonaws.com/spark:latest
command: ["/bin/sh",
"-c",
"date && \
cd /opt/spark/jars/ && \
wget http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/3.1.1/hadoop-aws-3.1.1.jar && \
wget http://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-core/1.11.396/aws-java-sdk-core-1.11.396.jar && \
cd ~ && \
YESTERDAY=TZ=GMT+24 date +%Y.%m.%d
&& \
/opt/spark/bin/spark-submit \
--deploy-mode k8s://https://ABCDEF0123456789.y99.us-east-1.eks.amazonaws.com \
--deploy-mode cluster \
--class com.enterprise.project.MainClass \
--conf 'spark.executor.instances=5' \
--conf 'spark.kubernetes.container.image=1234567890.dkr.ecr.us-east-1.amazonaws.com/spark:latest' \
--conf 'spark.kubernetes.namespace=spark' \
--conf 'spark.kubernetes.authenticate.driver.serviceAccountName=spark-service-acc' \
--conf 'spark.executorEnv.ENVIRONMENT=$(ENVIRONMENT)' \
--conf 'spark.executorEnv.AWS_ACCESS_KEY_ID=$(AWS_ACCESS_KEY_ID)' \
--conf 'spark.executorEnv.AWS_SECRET_ACCESS_KEY=$(AWS_SECRET_ACCESS_KEY)' \
--conf 'spark.kubernetes.driverEnv.ENVIRONMENT=$(ENVIRONMENT)' \
--conf 'spark.kubernetes.driverEnv.AWS_ACCESS_KEY_ID=$(AWS_ACCESS_KEY_ID)' \
--conf 'spark.kubernetes.driverEnv.AWS_SECRET_ACCESS_KEY=$(AWS_SECRET_ACCESS_KEY)' \
--conf 'spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem'
--conf 'spark.hadoop.fs.s3a.access.key=$(AWS_ACCESS_KEY_ID)'
--conf 'spark.hadoop.fs.s3a.secret.key=$(AWS_SECRET_ACCESS_KEY)'
--driver-class-path /opt/spark/jars/aws-java-sdk-core-1.11.396.jar
--driver-class-path /opt/spark/jars/hadoop-aws-3.1.1.jar
s3a://abc123/executable-file.jar \
my-executable-file-with-3-params $YESTERDAY s3a://abc123 esserver.abc.com:9200 &&
sleep 600"
]
            env: # all fields below need to be passed under a secret "env-vars"
            - name: ENVIRONMENT
              valueFrom:
                secretKeyRef:
                  name: env-vars
                  key: environment
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: env-vars
                  key: aws-access-key-id
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: env-vars
                  key: aws-secret-access-key
          restartPolicy: Never
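To test a change without waiting for the schedule, a one-off run can be triggered straight from the CronJob (plain kubectl; project-manual-run is just an arbitrary job name):

    # create a Job from the CronJob template and follow its pod logs
    kubectl -n spark create job --from=cronjob/project project-manual-run
    kubectl -n spark logs -f job/project-manual-run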
The error message is "main file not found at com.enterprise.project.MainClass". It shows up because the code cannot access the jar file. I know this because when I make the jar file public and fetch it over http instead, everything works.
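To rule spark-submit itself out, the S3A read can be probed in isolation. A minimal sketch (assuming the same image and env vars, and that spark-shell is present in it) that stats the jar through the Hadoop FileSystem API with the exact same confs:

    /opt/spark/bin/spark-shell \
      --driver-class-path /opt/spark/jars/aws-java-sdk-core-1.11.396.jar:/opt/spark/jars/hadoop-aws-3.1.1.jar \
      --conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
      --conf spark.hadoop.fs.s3a.access.key=$AWS_ACCESS_KEY_ID \
      --conf spark.hadoop.fs.s3a.secret.key=$AWS_SECRET_ACCESS_KEY <<'EOF'
    // same s3a lookup that spark-submit needs for the application jar
    val fs = org.apache.hadoop.fs.FileSystem.get(
      new java.net.URI("s3a://abc123/"), spark.sparkContext.hadoopConfiguration)
    println(fs.exists(new org.apache.hadoop.fs.Path("s3a://abc123/executable-file.jar")))
    EOF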
The strange part: when the executable does run (with $YESTERDAY computed by TZ=GMT+24 date +%Y.%m.%d), the second of its three parameters in my-executable-file-with-3-params $YESTERDAY s3a://abc123 esserver.abc.com:9200 is the location the execution writes its output file to. That write works fine (when I fetch the executable via http), and it goes to the very same bucket that holds the executable jar which, as described above, cannot be accessed via S3A.
This odd behavior is probably because the executable jar bundles its own hadoop jars, which manage those reads and writes correctly on their own. That leads me to suspect that the hadoop/aws configuration passed to spark-submit is what is not working. I can guarantee that the access key and secret key set as environment variables, as well as the bucket name, are correct.
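In case the default credential provider chain is what misbehaves, two things could be worth trying; a sketch, with two assumptions flagged: SimpleAWSCredentialsProvider ships inside hadoop-aws and reads exactly the fs.s3a.access.key / fs.s3a.secret.key pair set above, and hadoop-aws 3.1.1 declares its AWS dependency on the full aws-java-sdk-bundle rather than aws-java-sdk-core alone (1.11.271 should be the version from its pom, worth double-checking):

    # fetch the full SDK bundle instead of aws-java-sdk-core alone;
    # core by itself lacks the S3 client classes that S3A instantiates
    wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.11.271/aws-java-sdk-bundle-1.11.271.jar

    # extra conf for spark-submit: pin the S3A credential provider so only
    # the static key pair is used, not the whole default provider chain
    --conf 'spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider' \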
So: how can this jar file be accessed? And how do I fix this (probably hadoop configuration) problem?