I am trying to modify a date column.
The code is as follows:
sample = sample.withColumn('next_date', when(sample.next_date.isNull(), (sample['next_date'] + timedelta(days=1))).otherwise(sample['next_date']))
It gives me the following error:
AttributeError Traceback (most recent call last)
<ipython-input-127-dd09f90d8a49> in <module>()
6 sample = sample.withColumn('next_date', lead('date').over(windowSpecs))
7
----> 8 sample = sample.withColumn('next_date', when(sample.next_date.isNull(), (sample['next_date'] + timedelta(days=1))).otherwise(sample['next_date']))
9
10 sample = sample.withColumn('snapshot_date', lit(dt.datetime.now().strftime("%d-%m-%Y %H:%M")))
/usr/lib/spark/python/pyspark/sql/column.py in _(self, other)
108 def _(self, other):
109 jc = other._jc if isinstance(other, Column) else other
--> 110 njc = getattr(self._jc, name)(jc)
111 return Column(njc)
112 _.__doc__ = doc
/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
802
803 args_command = "".join(
--> 804 [get_command_part(arg, self.pool) for arg in new_args])
805
806 command = proto.CALL_COMMAND_NAME +\
/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py in get_command_part(parameter, python_proxy_pool)
276 command_part += ";" + interface
277 else:
--> 278 command_part = REFERENCE_TYPE + parameter._get_object_id()
279
280 command_part += "\n"
AttributeError: 'datetime.timedelta' object has no attribute '_get_object_id'
How can I solve this problem?
Thanks in advance!
Answer 0 (score: 0)
I know this is quite old, but I solved the problem like this:
sample = sample.withColumn('next_date', when(sample.next_date.isNull(), date_add(col('next_date'), 1)).otherwise(sample['next_date']))
Hope this helps someone!