I have a DataFrame. I need to add an array [a,a,b,b,c,c,d,d] to it in PySpark

Time: 2019-07-09 02:23:24

Tags: pyspark

I have a DataFrame df and an array arr = [1,1,2,2,3,3,4,4]. I need to add this array to the existing DataFrame df.

My code is as follows:

import numpy as np
from pyspark.sql import functions as F

low_limit = 2011
upper_limit = 2017
arr = np.repeat(np.arange(low_limit, upper_limit), 2)
df = df.withColumn('arrayYear', F.array(F.lit(arr))).show()

I get this error, a Py4JJavaError:

An error occurred while calling z:org.apache.spark.sql.functions.lit.
: java.lang.RuntimeException: Unsupported literal type class java.util.ArrayList [2011, 2011, 2012, 2012, 2013, 2013, 2014, 2014, 2015, 2015, 2016, 2016]
	at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:80)
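
The message suggests that the whole sequence reached `lit` as a single value, which the Spark version in the trace does not accept as a literal. One possible workaround, offered only as a sketch and not as a confirmed fix, is to wrap each element in its own `F.lit` and pass the resulting columns to `F.array`. The helper names `repeated_years` and `with_constant_array` below are hypothetical and introduced just for illustration. The values are built as plain Python ints, on the assumption that NumPy integer types may also be rejected by `lit`.

```python
def repeated_years(low_limit, upper_limit, times=2):
    """Return each year in [low_limit, upper_limit) repeated `times` times, as plain Python ints."""
    return [year for year in range(low_limit, upper_limit) for _ in range(times)]


def with_constant_array(df, column_name, values):
    """Return `df` with an extra array column holding the same `values` in every row."""
    # Imported lazily so the pure-Python helper above also works without Spark installed.
    from pyspark.sql import functions as F

    # Wrap each element in its own literal, then combine the literals into one array column.
    return df.withColumn(column_name, F.array(*[F.lit(v) for v in values]))


# Same values as np.repeat(np.arange(2011, 2017), 2), but as a list of Python ints.
years = repeated_years(2011, 2017)
```

With an existing DataFrame, this could be used as `df = with_constant_array(df, 'arrayYear', years)` followed by a separate `df.show()`. Calling `.show()` on its own line matters because `show()` returns `None`, so assigning its result back to `df`, as in the question, would discard the DataFrame. If the NumPy array is kept, `arr.tolist()` yields an equivalent list of Python ints.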

0 answers:

No answers