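I created a wrapper class to handle loads to AWS Redshift. After it completes a load, it returns 0 and then does not give me a prompt to continue working in the terminal.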
2016-03-04 11:17:56,013 [INFO] - Test load triggered, connecting now to staging_ipeds_k12_faculty_counts.
2016-03-04 11:17:56,013 [INFO] - Accessing ods_ipeds.staging_ipeds_k12_faculty_counts
2016-03-04 11:18:01,653 [INFO] - Load to the development database was a success.
2016-03-04 11:18:01,653 [INFO] - Commencing load operation to staging_ipeds_k12_faculty_counts.
2016-03-04 11:18:01,653 [INFO] - Accessing ods_ipeds.staging_ipeds_k12_faculty_counts
2016-03-04 11:18:05,407 [INFO] - Load to the production database was a success.
Process finished with exit code 0
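Is there a way to keep it from blocking my prompt? Right now I have to restart it, which resets my environment, variables, etc. I'm working in IPython.
Here is the class I'm calling: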
import logging
import sys

# Assumed: ppg2 is an alias for psycopg2, as the connect()/Error usage suggests.
import psycopg2 as ppg2


class LoadToRedshift(object):
    def __init__(self,
                 schema_name,
                 table,
                 manifest_url,
                 s3_credentials,
                 redshift_db_credentials,
                 dev_db_credentials=None,
                 safe_load=False,
                 truncate=False):
        """
        This class automates the copy of data from an S3 file to a Redshift
        database. Most of the methods are static, and can be accessed outside
        the class. Run the 'execute' method to run the process.
        :param schema_name: The schema name associated with the desired table.
        :param table: The Redshift table name.
        :param manifest_url: The location of the file on S3.
        :param s3_credentials: A dictionary containing the access and
        secret access keys. Keys must match the example:
            S3_INFO = {
                'aws_access_key_id': S3_ACCESS,
                'aws_secret_access_key': S3_SECRET,
                'region_name': 'us-west-2'
            }
        :param redshift_db_credentials: A dictionary containing the host, port,
        database name, username, and password. Keys must match example:
            REDSHIFT_POSTGRES_INFO = {
                'host': REDSHIFT_HOST,
                'port': REDSHIFT_PORT,
                'database': REDSHIFT_DATABASE_DEV,
                'user': REDSHIFT_USER,
                'password': REDSHIFT_PASS
            }
        :param dev_db_credentials: A dictionary in the same format as
        redshift_db_credentials, pointing at the development database. Used
        when safe_load is True.
        :param safe_load: If True will trigger a test load to a specified
        development database during the 'execute' method. Useful for making
        sure the data will correctly load before truncating the production
        database.
        :param truncate: If 'True', the production table will be truncated
        before the copy step.
        :return: None
        """
        self.schema_name = schema_name
        self.table_name = table
        self.manifest_url = manifest_url
        self.s3_creds = s3_credentials
        self.prod_db_creds = redshift_db_credentials
        self.dev_db_creds = dev_db_credentials
        self.safe_load = safe_load
        self.truncate = truncate

    def __repr__(self):
        return ('Schema Name: {}\nTable: {}\nManifest URL:'
                ' {}\nS3 Credentials: {}\nDev DB Credentials: {}\nProd DB '
                'Credentials: {}\nSafe Load: {}\nTruncate: {}'.format(
                    self.schema_name, self.table_name, self.manifest_url,
                    self.s3_creds, self.dev_db_creds, self.prod_db_creds,
                    self.safe_load, self.truncate
                ))

    @staticmethod
    def copy_to_db(database_credentials,
                   schema_name,
                   table_name,
                   manifest_url,
                   s3_credentials,
                   truncate=False):
        """
        Copies data from a file on S3 to a Redshift table. Data must be
        properly formatted and in the right order, etc...
        :param database_credentials: A dictionary containing the host, port,
        database name, username, and password. Keys must match example:
            REDSHIFT_POSTGRES_INFO = {
                'host': REDSHIFT_HOST,
                'port': REDSHIFT_PORT,
                'database': REDSHIFT_DATABASE_DEV,
                'user': REDSHIFT_USER,
                'password': REDSHIFT_PASS
            }
        :param schema_name: The Redshift schema name.
        :param table_name: The Redshift table name.
        :param manifest_url: The location of the file on the S3 server.
        :param s3_credentials: A dictionary containing the access and
        secret access keys. Keys must match the example:
            S3_INFO = {
                'aws_access_key_id': S3_ACCESS,
                'aws_secret_access_key': S3_SECRET,
                'region_name': 'us-west-2'
            }
        :param truncate: If 'True', will cause the table to be truncated before
        the load.
        :return: None
        """
        s3_access = s3_credentials['aws_access_key_id']
        s3_secret = s3_credentials['aws_secret_access_key']
        logging.info('Accessing {}'.format(schema_name + '.' + table_name))
        # Bind conn before the try block so that the finally clause does not
        # raise a NameError when connect() itself fails.
        conn = None
        try:
            conn = ppg2.connect(**database_credentials)
            with conn:
                cur = conn.cursor()
                if truncate:
                    LoadToRedshift.truncate_table(schema_name, table_name, cur)
                load = '''
                copy {}.{}
                from '{}'
                credentials 'aws_access_key_id={};aws_secret_access_key={}'
                delimiter '|'
                gzip
                trimblanks
                truncatecolumns
                acceptinvchars
                timeformat 'auto'
                dateformat 'auto'
                '''.format(
                    schema_name, table_name, manifest_url, s3_access, s3_secret)
                cur.execute(load)
                conn.commit()
        except ppg2.Error as e:
            logging.critical('Error occurred during db load: {}'.format(e))
            sys.exit(1)
        finally:
            if conn is not None:
                conn.close()

    @staticmethod
    def truncate_table(schema, table, cursor):
        """Truncates a table given the schema and table names."""
        trunc_stmt = '''
        truncate table {}.{}
        '''.format(schema, table)
        cursor.execute(trunc_stmt)

    def execute(self):
        if self.safe_load:
            logging.info('Test load triggered, connecting now to {}.'.format(
                self.table_name
            ))
            self.copy_to_db(self.dev_db_creds,
                            self.schema_name,
                            self.table_name,
                            self.manifest_url,
                            self.s3_creds,
                            self.truncate)
            logging.info('Load to the development database was a success.')
        logging.info('Commencing load operation to {}.'.format(
            self.table_name))
        self.copy_to_db(self.prod_db_creds,
                        self.schema_name,
                        self.table_name,
                        self.manifest_url,
                        self.s3_creds,
                        self.truncate)
        logging.info('Load to the production database was a success.')
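To see exactly what `copy_to_db` sends to Redshift without opening a connection, the COPY statement can be built by a separate function and printed. This is only a minimal sketch: `build_copy_statement` is a hypothetical helper that is not part of the class, and the bucket name and keys are placeholders.

```python
def build_copy_statement(schema_name, table_name, manifest_url,
                         s3_access, s3_secret):
    """Return the COPY statement text without touching the database.

    Hypothetical helper: it mirrors the template used in copy_to_db.
    """
    template = (
        "copy {}.{}\n"
        "from '{}'\n"
        "credentials 'aws_access_key_id={};aws_secret_access_key={}'\n"
        "delimiter '|'\n"
        "gzip\n"
        "trimblanks\n"
        "truncatecolumns\n"
        "acceptinvchars\n"
        "timeformat 'auto'\n"
        "dateformat 'auto'\n"
    )
    return template.format(schema_name, table_name, manifest_url,
                           s3_access, s3_secret)


statement = build_copy_statement(
    'ods_ipeds',
    'staging_ipeds_k12_faculty_counts',
    's3://example-bucket/ipeds/K_12.csv.gz',  # placeholder bucket name
    'EXAMPLE_ACCESS_KEY',                     # placeholder credentials
    'EXAMPLE_SECRET_KEY')
print(statement)
```

Keeping the string building apart from the connection handling makes the statement easy to inspect from an interactive session before running a real load.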
How I'm calling it:
l1 = aws.LoadToRedshift(
    'ods_ipeds',
    'staging_ipeds_k12_faculty_counts',
    's3://' + config3.S3_BUCKET + '/' + 'ipeds/K_12.csv.gz',
    config3.S3_INFO,
    config3.REDSHIFT_POSTGRES_INFO_PROD,
    dev_db_credentials=config3.REDSHIFT_POSTGRES_INFO,
    safe_load=True,
    truncate=True)
l1.execute()
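A side note on the `sys.exit(1)` call in `copy_to_db`: `sys.exit` raises `SystemExit`, which ends a script but can be caught like any other exception. The log above shows both loads succeeding, so this path may not be the one involved here. The sketch only illustrates the mechanism, using a hypothetical stand-in `fake_load` rather than a real Redshift connection; `run_without_exiting` is likewise an invented name.

```python
import sys


def fake_load(fail):
    """Hypothetical stand-in for copy_to_db: calls sys.exit(1) on failure."""
    if fail:
        sys.exit(1)
    return 'loaded'


def run_without_exiting(func, *args):
    """Call func and turn a SystemExit into an ordinary return value."""
    try:
        return func(*args)
    except SystemExit as exc:
        # exc.code holds the value that was passed to sys.exit().
        return 'exit requested with code {}'.format(exc.code)


print(run_without_exiting(fake_load, False))
print(run_without_exiting(fake_load, True))
```

In an interactive session, wrapping the call this way keeps the surrounding variables available even when the wrapped code asks the interpreter to exit.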