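I am trying to connect to Athena with pyathenajdbc.connect(). My AWS credentials are set up with multi-factor authentication. When I do not pass the AWS session token to connect(), I get the following error.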
athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)
ERROR: pyathenajdbc.error.DatabaseError: The security token included in the request is invalid. (Service: AmazonAthena; Status Code: 400; Error Code: UnrecognizedClientException; Request ID: 0d488c0b-1eed-11e7-bad8-711e54af6b73)
When I do pass the AWS session token to connect(), I get the following error:
athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, token=AWS_SESSION_TOKEN, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)
ERROR: pyathenajdbc.error.DatabaseError: The security token included in the request is invalid. (Service: AmazonAthena; Status Code: 400; Error Code: UnrecognizedClientException; Request ID: 91751051-1eed-11e7-8347-153dfe3d84a6)
Does anyone know what is wrong here?
Here is my entire code.
from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
from boto3 import Session
import jpype
jvm_path = jpype.getDefaultJVMPath()
_current_credentials = Session().get_credentials()
AWS_KEY_ID = _current_credentials.access_key
AWS_SECRET = _current_credentials.secret_key
AWS_SESSION_TOKEN = _current_credentials.token
REGION = "us-east-2"
#athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)
athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, token=AWS_SESSION_TOKEN, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)
cursor = athena_conn.cursor()
query = 'SELECT * FROM xyz.ABC limit 1;'
cursor.execute(query)
df = as_pandas(cursor)
print(df)
Answer 0 (score: 2)
from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
from boto3 import Session
import os
_current_credentials = Session().get_credentials()
os.environ['AWS_ACCESS_KEY_ID'] = _current_credentials.access_key
os.environ['AWS_SECRET_ACCESS_KEY'] = _current_credentials.secret_key
os.environ['AWS_SESSION_TOKEN'] = _current_credentials.token
athena_conn = connect(s3_staging_dir='s3://your-bucket/',
                      region_name='us-west-2',
                      aws_credentials_provider_class='com.amazonaws.athena.jdbc.shaded.com.amazonaws.auth.EnvironmentVariableCredentialsProvider')
cursor = athena_conn.cursor()
query = 'SELECT * FROM schema.table_name limit 1;'
cursor.execute(query)
df = as_pandas(cursor)
print(df)
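One caveat when copying credentials into environment variables this way: os.environ only accepts strings, so assigning a token of None (which is what you get with long-term keys that have no session token) raises a TypeError. Below is a minimal sketch of a guard, assuming you want to skip the token when there is none. export_credentials is a hypothetical helper name, not part of pyathenajdbc or boto3, and the key values are placeholders.

```python
import os

def export_credentials(access_key, secret_key, token=None, environ=os.environ):
    """Write AWS credentials into an environment mapping.

    Hypothetical helper for illustration; not part of pyathenajdbc or boto3.
    """
    environ['AWS_ACCESS_KEY_ID'] = access_key
    environ['AWS_SECRET_ACCESS_KEY'] = secret_key
    if token:
        environ['AWS_SESSION_TOKEN'] = token
    else:
        # Assigning None to os.environ raises TypeError, and a stale token
        # left over from an earlier session would be sent with the new keys.
        environ.pop('AWS_SESSION_TOKEN', None)
    return environ

# Demonstration on a plain dict so the real environment is left untouched.
env = export_credentials('AKIAEXAMPLEKEYID', 'example-secret', token=None, environ={})
print(sorted(env))
```

In real use you would call it with the values from Session().get_credentials() and leave environ at its default before calling connect().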
Answer 1 (score: 2)
Assuming you have a config file under the ~/.aws folder that defines the region, you can use Session().region_name.
The following works fine (no need to import os):
from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
from boto3 import Session
import jpype
jvm_path = jpype.getDefaultJVMPath()
_current_credentials = Session().get_credentials()
AWS_KEY_ID = _current_credentials.access_key
AWS_SECRET = _current_credentials.secret_key
REGION = Session().region_name
athena_conn = connect(access_key=AWS_KEY_ID,
                      secret_key=AWS_SECRET,
                      s3_staging_dir='path_to_staging_dir',
                      region_name=REGION)
cursor = athena_conn.cursor()
query = 'SELECT current_date;'
cursor.execute(query)
df = as_pandas(cursor)
print(df)
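Answer 2 (score: 0)
The problem is not obvious, but my guess is that it is related to your credentials. You should investigate a little: try printing your keys and verify that they are valid.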
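For the "print your keys" step, here is a rough sketch of a check that does not dump the secrets in full. It relies on one thing I am sure of: access key IDs starting with "ASIA" are temporary (STS) credentials and only work together with a session token, while "AKIA" keys are long-term. summarize_credentials is a hypothetical helper name and the values passed in are placeholders.

```python
def summarize_credentials(access_key, secret_key, token=None):
    """Return a printable summary of AWS credentials without exposing secrets.

    Hypothetical helper for illustration only.
    """
    # "ASIA" prefix: temporary STS credentials that need a session token.
    # "AKIA" prefix: long-term IAM user keys.
    temporary = access_key.startswith('ASIA')
    return {
        'access_key': access_key[:4] + '...' + access_key[-4:],
        'secret_key_length': len(secret_key),
        'temporary': temporary,
        'has_token': bool(token),
        'token_missing': temporary and not token,
    }

summary = summarize_credentials('ASIAEXAMPLEEXAMPLE12', 'x' * 40, token=None)
print(summary)
```

If token_missing comes back True, the keys are temporary and the request will be rejected unless the matching session token is sent along with them.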
Here is an alternative way I use to supply the credentials:
import configparser
import os
aws_config_file = '~/.aws/config'
Config = configparser.ConfigParser()
Config.read(os.path.expanduser(aws_config_file))
access_key_id = Config['default']['aws_access_key_id']
secret_key_id = Config['default']['aws_secret_access_key']
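If you use named profiles, keep in mind that ~/.aws/config names them "[profile name]" while ~/.aws/credentials uses plain "[name]"; the default profile is "[default]" in both. A minimal sketch of the same configparser approach that accounts for this and also picks up a session token if one is stored. read_profile_keys and the sample text are hypothetical, for illustration only.

```python
import configparser

def read_profile_keys(config_text, profile='default', is_config_file=True):
    """Extract the key settings for one profile from AWS config-style text.

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Named profiles are "[profile name]" in ~/.aws/config and "[name]"
    # in ~/.aws/credentials; the default profile is "[default]" in both.
    if is_config_file and profile != 'default':
        section = 'profile ' + profile
    else:
        section = profile
    values = parser[section]
    return {
        'access_key': values.get('aws_access_key_id'),
        'secret_key': values.get('aws_secret_access_key'),
        'token': values.get('aws_session_token'),  # None when absent
    }

# Placeholder config text; real files live under ~/.aws/.
sample = """
[default]
aws_access_key_id = AKIAEXAMPLE
aws_secret_access_key = example-secret

[profile mfa]
aws_access_key_id = ASIAEXAMPLE
aws_secret_access_key = another-secret
aws_session_token = example-token
"""

print(read_profile_keys(sample, 'mfa'))
```

To read a real file, pass the contents of open(os.path.expanduser('~/.aws/config')).read() instead of the sample string.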
Otherwise, just to make sure the problem is not related to the JDBC driver, please paste the output of the following commands:
import pyathenajdbc
print(pyathenajdbc.ATHENA_CONNECTION_STRING)
print(pyathenajdbc.ATHENA_DRIVER_CLASS_NAME)
print(pyathenajdbc.ATHENA_DRIVER_DOWNLOAD_URL)
print(pyathenajdbc.ATHENA_JAR)