AttributeError: 'datetime.timedelta' object has no attribute '_get_object_id' in PySpark

Asked: 2017-03-24 20:51:47

Tags: datetime pyspark attributeerror timedelta

I am trying to modify a date column.

Here is the code:

sample = sample.withColumn('next_date', when(sample.next_date.isNull(), (sample['next_date'] + timedelta(days=1))).otherwise(sample['next_date']))

It gives me the following error:

    AttributeError                            Traceback (most recent call last)
<ipython-input-127-dd09f90d8a49> in <module>()
      6 sample = sample.withColumn('next_date', lead('date').over(windowSpecs))
      7 
----> 8 sample = sample.withColumn('next_date', when(sample.next_date.isNull(), (sample['next_date'] + timedelta(days=1))).otherwise(sample['next_date']))
      9 
     10 sample = sample.withColumn('snapshot_date', lit(dt.datetime.now().strftime("%d-%m-%Y %H:%M")))

/usr/lib/spark/python/pyspark/sql/column.py in _(self, other)
    108     def _(self, other):
    109         jc = other._jc if isinstance(other, Column) else other
--> 110         njc = getattr(self._jc, name)(jc)
    111         return Column(njc)
    112     _.__doc__ = doc

/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
    802 
    803         args_command = "".join(
--> 804             [get_command_part(arg, self.pool) for arg in new_args])
    805 
    806         command = proto.CALL_COMMAND_NAME +\

/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py in get_command_part(parameter, python_proxy_pool)
    276             command_part += ";" + interface
    277     else:
--> 278         command_part = REFERENCE_TYPE + parameter._get_object_id()
    279 
    280     command_part += "\n"

AttributeError: 'datetime.timedelta' object has no attribute '_get_object_id'

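How can I fix this?

Thanks in advance!

1 answer:

Answer 0: (score: 0)

I know this is quite old, but I solved the problem by using date_add instead of timedelta: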

from pyspark.sql.functions import col, date_add, when
sample = sample.withColumn('next_date', when(sample.next_date.isNull(), date_add(col('next_date'), 1)).otherwise(sample['next_date']))

Hope this helps someone!