Question

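I want to run Spark code on EC2 against data stored in my S3 bucket. According to {{3}} and the Spark EC2 documentation, I have to add my AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to the core-site.xml file. However, when I shell into the master EC2 node, I see several core-site.xml files.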

$ find . -name core-site.xml
./mapreduce/conf/core-site.xml
./persistent-hdfs/share/hadoop/templates/conf/core-site.xml
./persistent-hdfs/src/packages/templates/conf/core-site.xml
./persistent-hdfs/src/contrib/test/core-site.xml
./persistent-hdfs/src/test/core-site.xml
./persistent-hdfs/src/c++/libhdfs/tests/conf/core-site.xml
./persistent-hdfs/conf/core-site.xml
./ephemeral-hdfs/share/hadoop/templates/conf/core-site.xml
./ephemeral-hdfs/src/packages/templates/conf/core-site.xml
./ephemeral-hdfs/src/contrib/test/core-site.xml
./ephemeral-hdfs/src/test/core-site.xml
./ephemeral-hdfs/src/c++/libhdfs/tests/conf/core-site.xml
./ephemeral-hdfs/conf/core-site.xml
./spark-ec2/templates/root/mapreduce/conf/core-site.xml
./spark-ec2/templates/root/persistent-hdfs/conf/core-site.xml
./spark-ec2/templates/root/ephemeral-hdfs/conf/core-site.xml
./spark-ec2/templates/root/spark/conf/core-site.xml
./spark/conf/core-site.xml

After some experimentation, I determined that I can access an s3n URL such as s3n://mcneill-scratch/GR.txt from Spark only if I add my credentials to both mapreduce/conf/core-site.xml and spark/conf/core-site.xml.

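This doesn't seem right to me. It isn't in the Amazon S3 documentation, and I can't find anything in the documentation that says you have to add your credentials to multiple files.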

Is modifying multiple files the correct way to set S3 credentials via core-site.xml? Is there documentation somewhere that explains this?

Answer 1

./spark/conf/core-site.xml should be the right place.
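
For illustration, here is a minimal sketch of adding the credentials to that file. It assumes the s3n:// filesystem reads the Hadoop properties fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey, and that the file's root element is <configuration>. The helper names set_property and add_s3n_credentials are made up for this example and are not part of Spark or Hadoop. Note that ElementTree drops XML comments when it rewrites a file, so back up the original first.

```python
import xml.etree.ElementTree as ET

# Hadoop property names used by the s3n:// filesystem (assumed here).
S3N_ACCESS_KEY = "fs.s3n.awsAccessKeyId"
S3N_SECRET_KEY = "fs.s3n.awsSecretAccessKey"


def set_property(root, name, value):
    """Set a <property> entry under root, updating it if it already exists."""
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            return
    # No existing entry with this name: append a new one.
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value


def add_s3n_credentials(path, access_key, secret_key):
    """Write both s3n credential properties into the core-site.xml at path."""
    tree = ET.parse(path)
    root = tree.getroot()  # expected to be <configuration>
    set_property(root, S3N_ACCESS_KEY, access_key)
    set_property(root, S3N_SECRET_KEY, secret_key)
    tree.write(path, encoding="utf-8", xml_declaration=True)
```

On the master node this could be called as add_s3n_credentials("spark/conf/core-site.xml", os.environ["AWS_ACCESS_KEY_ID"], os.environ["AWS_SECRET_ACCESS_KEY"]) after importing os. Whether the other core-site.xml files also need the same entries is exactly what the question asks, and this sketch does not settle it.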

Which core-site.xml do I add my AWS access keys to?
