Error when using window functions in PySpark

Time: 2020-04-12 14:27:57

Tags: pyspark python-3.7 spark2

I am trying to run the following code:

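from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

# Reuse the active session, or create one when running as a script
spark = SparkSession.builder.getOrCreate()

# Read the tab-separated employees file with an explicit schema
employees = (spark.read.format('csv')
             .option('sep', '\t')
             .schema('''EMP_ID INT,F_NAME STRING,L_NAME STRING,
                        EMAIL STRING,PHONE_NR STRING,HIRE_DATE STRING,
                        JOB_ID STRING,SALARY FLOAT,
                        COMMISSION_PCT STRING,
                        MANAGER_ID STRING,DEP_ID STRING''')
             .load('C:/data/hr_db/employees')
)

# One window per department
spec = Window.partitionBy('DEP_ID')

# F.sum is Spark's aggregate; Python's built-in sum() has no .over()
emp = (employees
         .select('JOB_ID', 'DEP_ID', 'SALARY')
         .withColumn('Total Salary', F.sum('SALARY').over(spec))
         .orderBy('DEP_ID')
)

emp.show()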

and I get the following error:

File "C:\spark-2.4.4-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o60.showString.
: java.lang.IllegalArgumentException: Unsupported class file major version 56

Can anyone help me resolve this error?
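
For context on the message above: the "class file major version" is a number the Java compiler writes into every class file, and from Java 1.1 onward it equals the Java release number plus 44. The sketch below only decodes that number from the error text. The helper names `java_release_from_major` and `java_release_from_error` are made up for illustration and are not part of PySpark or Py4J.

```python
import re

# For Java 1.1 onward, class file major version = Java release + 44
# (52 -> Java 8, 55 -> Java 11, 56 -> Java 12).
CLASS_FILE_OFFSET = 44


def java_release_from_major(major: int) -> int:
    """Return the Java release that writes the given class file major version."""
    if major < 45:
        raise ValueError(f"not a valid class file major version: {major}")
    return major - CLASS_FILE_OFFSET


def java_release_from_error(message: str):
    """Hypothetical helper: pull the major version out of an error message."""
    match = re.search(r"class file major version (\d+)", message)
    if match is None:
        return None
    return java_release_from_major(int(match.group(1)))


error = "java.lang.IllegalArgumentException: Unsupported class file major version 56"
print(java_release_from_error(error))  # prints 12
```

Version 56 therefore corresponds to Java 12. The Spark 2.4 documentation lists Java 8 as the supported runtime, so one plausible reading, not confirmed anywhere in this thread, is that the session is being started with a newer JDK than Spark 2.4.4 expects. Running `java -version` in the same environment would show which JDK is actually being picked up.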

0 answers:

No answers yet