Error converting a DynamicFrame to a DataFrame in AWS Glue

Time: 2020-04-22 13:46:40

Tags: python dataframe apache-spark pyspark aws-glue

When converting a DynamicFrame to a DataFrame, I get the error 'TypeError: frame_or_dfc must be DynamicFrame or DynamicFrameCollection. Got'.

This is the code I am using:

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext

from awsglue.job import Job
## @params: [JOB_NAME]
args = getResolvedOptions(sys.argv, ['JOB_NAME'])

sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

datasource0 = glueContext.create_dynamic_frame.from_catalog(database = "test database", table_name = "webinar_csv", transformation_ctx = "datasource0")

applymapping1 = ApplyMapping.apply(frame = datasource0, mappings = [("webinar_id", "long", "webinar_id", "long"), ("grade", "string", "grade", "string"), ("subject", "string", "subject", "string"), ("title", "string", "title", "string"), ("presenter", "string", "presenter", "string"), ("duration", "long", "duration", "long"), ("start_epoch", "long", "start_epoch", "long"), ("end_epoch", "long", "end_epoch", "long"), ("start_date", "string", "start_date", "string"), ("start_time", "string", "start_time", "string")], transformation_ctx = "applymapping1")

resolvechoiceselected1 = ResolveChoice.apply(frame = applymapping1, choice = "make_struct", transformation_ctx = "resolvechoiceselected1")

resolvechoiceselected1.toDF().createOrReplaceTempView("webinar_metadata")

consolidated_df = spark.sql("""select * from webinar_metadata""")

datasink2 = glueContext.write_dynamic_frame.from_options(frame = consolidated_df, connection_type = "s3", connection_options = {"path": "s3://tnl-tmp/test_folder"}, format = "csv", transformation_ctx = "datasink2")

job.commit()
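
Going by the code above, a likely cause is that `spark.sql` returns a Spark DataFrame, while `write_dynamic_frame.from_options` expects a DynamicFrame, so the error would come from the write step rather than from `toDF()`. The following is a minimal sketch of one possible fix, assuming the DataFrame is converted back with `DynamicFrame.fromDF` before writing. The helper name `write_dataframe_as_csv` and its parameters are hypothetical and do not appear in the original post.

```python
def write_dataframe_as_csv(glue_context, df, s3_path, name="consolidated_dyf"):
    """Convert a Spark DataFrame to a DynamicFrame, then write it to S3 as CSV.

    Hypothetical helper: glue_context is a GlueContext, df is a Spark
    DataFrame (for example the result of spark.sql), s3_path is the target.
    """
    # Imported inside the function so the file can be loaded outside Glue.
    from awsglue.dynamicframe import DynamicFrame

    # write_dynamic_frame only accepts DynamicFrame objects, so wrap the
    # DataFrame first.
    dyf = DynamicFrame.fromDF(df, glue_context, name)

    # Same sink settings as the datasink2 call in the question.
    return glue_context.write_dynamic_frame.from_options(
        frame=dyf,
        connection_type="s3",
        connection_options={"path": s3_path},
        format="csv",
        transformation_ctx="datasink2",
    )
```

In the job above, this would stand in for the `datasink2` line, for example `write_dataframe_as_csv(glueContext, consolidated_df, "s3://tnl-tmp/test_folder")`. If no DynamicFrame-specific sink features are needed, writing with the DataFrame's own writer (`consolidated_df.write.csv(...)`) may be another option.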

0 answers:

No answers