Writing partitioned text files is straightforward.
cursors = mongo_collection.parallel_scan(6)
if __name__ == '__main__':
    processes = [multiprocessing.Process(target=process_cursor, args=(cursor,)) for cursor in cursors]
Exception -
dataDF.write.partitionBy("year", "month", "date").mode(SaveMode.Append).text("s3://data/test2/events/")
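For reference, partitionBy writes each combination of partition values into its own Hive-style directory (column=value) under the base path. The sketch below only illustrates that layout; the helper name partition_dir and the sample values 2020, 1, 15 are hypothetical and not part of Spark or of the original data.

```python
def partition_dir(base, partitions):
    """Return the Hive-style directory for one set of partition values.

    `partitions` is an ordered list of (column, value) pairs, given in the
    same order as the columns passed to partitionBy.
    """
    # Each partition column becomes one "column=value" path segment.
    segments = [f"{column}={value}" for column, value in partitions]
    return "/".join([base.rstrip("/")] + segments) + "/"


# Hypothetical sample values, for illustration only.
print(partition_dir("s3://data/test2/events/",
                    [("year", 2020), ("month", 1), ("date", 15)]))
```

Note that the partition columns are encoded in the path rather than in the file contents, and the text writer expects the remaining data to be a single string column.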