I want to generate a column of random numbers, like this:
df=df.withColumn("random_col",random.randint(100000, 1000000))
The above gives me an error:
AssertionError: col should be Column
Answer 0 (score: 0)
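First, I would make sure you have imported the right thing. withColumn expects a Column expression, and random.randint returns a plain Python int, which is why the assertion fails.
Try importing:
from pyspark.sql.functions import rand
Then try running the following code: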
df1 = df.withColumn("random_col", (rand() * 900001 + 100000).cast("int"))
You could also check out this resource. It looks like it may be helpful for what you are doing.
Hope this helps!