PySpark error: java.lang.IllegalArgumentException

Asked: 2018-09-16 08:11:36

Tags: apache-spark pyspark apache-spark-sql

I am trying to compute a lag, but Spark returns the following error:

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-31-505483db4b4b> in <module>()
      6         F.col('filled_serie') + 1)\
      7     .otherwise(F.col('filled_serie')))
----> 8 tcheck_up_stages_dataset_bis.show()

~\Anaconda3\lib\site-packages\pyspark\sql\dataframe.py in show(self, n, truncate, vertical)
    348         """
    349         if isinstance(truncate, bool) and truncate:
--> 350             print(self._jdf.showString(n, 20, vertical))
    351         else:
    352             print(self._jdf.showString(n, int(truncate), vertical))

~\Anaconda3\lib\site-packages\py4j\java_gateway.py in __call__(self, *args)
   1255         answer = self.gateway_client.send_command(command)
   1256         return_value = get_return_value(
-> 1257             answer, self.gateway_client, self.target_id, self.name)
   1258 
   1259         for temp_arg in temp_args:

~\Anaconda3\lib\site-packages\pyspark\sql\utils.py in deco(*a, **kw)
     61     def deco(*a, **kw):
     62         try:
---> 63             return f(*a, **kw)
     64         except py4j.protocol.Py4JJavaError as e:
     65             s = e.java_exception.toString()

~\Anaconda3\lib\site-packages\py4j\protocol.py in get_return_value(answer, gateway_client, target_id, name)
    326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling o2512.showString.
: java.lang.IllegalArgumentException
    at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
    at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
    at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
    at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:46)
    at 

... .... ....

java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.base/java.lang.reflect.Method.invoke(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:282)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.base/java.lang.Thread.run(Unknown Source)

My dataset looks like this. The columns after sales are features I created.

+------------+--------+--------------+------------+------------------+------+---+-----+----+-------+-------+----------+--------+------+------------+-----------+-------+
|        id_s| idstore|     date_sale|f_qty_recalc|               prc| sales|day|month|year|weekday|weekend|monthbegin|monthend|season|monthquarter|yearquarter|yearday|
+------------+--------+--------------+------------+------------------+------+---+-----+----+-------+-------+----------+--------+------+------------+-----------+-------+
|    471809.0|     184|    2016-07-30|           2|              1.29|  2.58| 30|    7|2016|      7|      0|         0|       1|Spring|           3|          2|    212|
|    143686.0|     355|    2016-07-30|           1|              2.09|  2.09| 30|    7|2016|      7|      0|         0|       1|Spring|           3|          2|    212|
|    104984.0|     184|    2016-07-30|           2|              2.39|  4.78| 30|    7|2016|      7|      0|         0|       1|Spring|           3|          2|    212|
|    470174.0|     355|    2016-07-30|           1|               1.0|   1.0| 30|    7|2016|      7|      0|         0|       1|Spring|           3|          2|    212|
|    971332.0|     355|    2016-07-30|           1|              1.29|  1.29| 30|    7|2016|      7|      0|         0|       1|Spring|           3|          2|    212|
|    377321.0|     183|    2016-07-30|           1|              0.99|  0.99| 30|    7|2016|      7|      0|         0|       1|Spring|           3|          2|    212|

I then run the script below, which produces the traceback above:

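# Imports, assumed to have been run in an earlier notebook cell
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Start with a counter column filled with 0
tcheck_up_stages_dataset_bis = Ticket.withColumn('filled_serie', F.lit(0))
window = Window.partitionBy(["id_s", "idstore"]).orderBy("date_sale")
# For each lag offset from 6 down to 0, add 1 when the lagged prc is null
for index in reversed(range(0, 7)):
    tcheck_up_stages_dataset_bis = tcheck_up_stages_dataset_bis.withColumn('filled_serie', F.when(
        F.isnull(F.lag(F.col("prc"), index).over(window)),
        F.col('filled_serie') + 1)\
    .otherwise(F.col('filled_serie')))
tcheck_up_stages_dataset_bis.show()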
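
As a reference for what the loop above is meant to compute, here is a minimal plain-Python sketch that does not need Spark. The function name count_missing_lags and the parameter max_lag are hypothetical, introduced only for illustration. It assumes that a lag reaching before the start of a partition counts as missing, which is how a window lag with no default value behaves.

```python
from collections import defaultdict


def count_missing_lags(rows, max_lag=7):
    """Count, per row, how many lag offsets 0..max_lag-1 have a missing prc.

    Hypothetical helper for illustration only. `rows` is a list of dicts
    with the keys id_s, idstore, date_sale and prc.
    """
    # Group rows the way partitionBy(["id_s", "idstore"]) would.
    partitions = defaultdict(list)
    for row in rows:
        partitions[(row["id_s"], row["idstore"])].append(row)

    result = []
    for part in partitions.values():
        # Order each partition the way orderBy("date_sale") would.
        ordered = sorted(part, key=lambda r: r["date_sale"])
        for pos, row in enumerate(ordered):
            missing = 0
            for offset in range(max_lag):
                prev = pos - offset
                # A lag past the partition start yields no value.
                lagged = ordered[prev]["prc"] if prev >= 0 else None
                if lagged is None:
                    missing += 1
            result.append({**row, "filled_serie": missing})
    return result


# Made-up sample rows, not taken from the dataset above.
sample = [
    {"id_s": 1.0, "idstore": 184, "date_sale": "2016-07-30", "prc": 1.29},
    {"id_s": 1.0, "idstore": 184, "date_sale": "2016-07-28", "prc": 1.29},
    {"id_s": 1.0, "idstore": 184, "date_sale": "2016-07-29", "prc": None},
]
counts = [r["filled_serie"] for r in count_missing_lags(sample)]
```

Comparing a small sample against this sketch may help confirm that the window logic itself is fine and that the exception comes from elsewhere.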

I don't know where the error comes from. I also get the same error when I perform a join. Can someone explain what is going on?
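
One thing that may be worth checking is the Java version. The stack frames prefixed with java.base/ only appear on Java 9 or later, and the ASM 5 ClassReader seen at the top of the Java trace is generally reported to reject class files newer than Java 8. Whether that is the cause here is an assumption, not something the traceback proves. The sketch below parses the major version from the text printed by `java -version`; the function name java_major_version is hypothetical.

```python
import re


def java_major_version(version_output):
    """Extract the major Java version from `java -version` style output.

    Hypothetical helper for illustration only. Handles the legacy scheme
    ("1.8.0_181" means Java 8) and the newer scheme ("10.0.2" means
    Java 10). Returns None when no version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Legacy versions are written as 1.x, where x is the major version.
    if first == 1 and second is not None:
        return int(second)
    return first
```

Note that `java -version` writes to stderr rather than stdout, so the text would have to be read from there, for example from the stderr attribute of the result of subprocess.run(["java", "-version"], capture_output=True, text=True).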

0 answers:

No answers yet