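I have a df of consecutive rows ordered by timestamp, and I want to split it wherever the distance value is greater than 1000.
The df looks like this: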
+-----------------+-------------------+---+
|timestamp |distance |id |
+-----------------+-------------------+---+
|1541712752000 |1.1990470282994594 |123|
|1541713551000 |1.5804709872862326 |123|
|1541714462000 |0.0 |123|
|1541715475000 |0.53107795768697 |123|
|1541716383000 |0.53107795768697 |123|
|1541716792000 |0.24740321078091282|123|
|1541717695000 |1542.00 |123|
|1541717801000 |2.7767418047706816 |123|
|1541718779000 |13.058715260118664 |123|
|1541719672000 |22.64146251404579 |123|
|1541720581000 |23.861007122654314 |123|
|1541721502000 |16.327504368653443 |123|
|1541722572000 |26.084599108380274 |123|
|1541723500000 |20.630034360787512 |123|
|1541724219000 |1893.00 |123|
|1541725264000 |23.16455204686255 |123|
|1541726037000 |15.911555304774817 |123|
|1541726950000 |20.057274313740784 |123|
|1541727884000 |12.967418789242549 |123|
|1541728085000 |2.720850595301784 |123|
+-----------------+-------------------+---+
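Splitting the df at every row where distance is greater than 1000 (each such row starts a new part), I would like to end up with three dfs like the following: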
+-----------------+-------------------+---+
|timestamp |distance |id |
+-----------------+-------------------+---+
|1541712752000 |1.1990470282994594 |123|
|1541713551000 |1.5804709872862326 |123|
|1541714462000 |0.0 |123|
|1541715475000 |0.53107795768697 |123|
|1541716383000 |0.53107795768697 |123|
|1541716792000 |0.24740321078091282|123|
+-----------------+-------------------+---+
+-----------------+-------------------+---+
|timestamp |distance |id |
+-----------------+-------------------+---+
|1541717695000 |1542.00 |123|
|1541717801000 |2.7767418047706816 |123|
|1541718779000 |13.058715260118664 |123|
|1541719672000 |22.64146251404579 |123|
|1541720581000 |23.861007122654314 |123|
|1541721502000 |16.327504368653443 |123|
|1541722572000 |26.084599108380274 |123|
|1541723500000 |20.630034360787512 |123|
+-----------------+-------------------+---+
+-----------------+-------------------+---+
|timestamp |distance |id |
+-----------------+-------------------+---+
|1541724219000 |1893.00 |123|
|1541725264000 |23.16455204686255 |123|
|1541726037000 |15.911555304774817 |123|
|1541726950000 |20.057274313740784 |123|
|1541727884000 |12.967418789242549 |123|
|1541728085000 |2.720850595301784 |123|
+-----------------+-------------------+---+
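To make the intended rule concrete, here is a minimal plain-Python sketch (not Spark code) of the grouping I have in mind: keep a running count of the rows whose distance exceeds 1000 and use that count as a segment number. The function name `split_on_threshold` and the `rows` list are only for illustration; the rows are copied from the table above and are assumed to be already sorted by timestamp.

```python
from itertools import accumulate

# (timestamp, distance, id) rows copied from the sample df, sorted by timestamp.
rows = [
    (1541712752000, 1.1990470282994594, 123),
    (1541713551000, 1.5804709872862326, 123),
    (1541714462000, 0.0, 123),
    (1541715475000, 0.53107795768697, 123),
    (1541716383000, 0.53107795768697, 123),
    (1541716792000, 0.24740321078091282, 123),
    (1541717695000, 1542.00, 123),
    (1541717801000, 2.7767418047706816, 123),
    (1541718779000, 13.058715260118664, 123),
    (1541719672000, 22.64146251404579, 123),
    (1541720581000, 23.861007122654314, 123),
    (1541721502000, 16.327504368653443, 123),
    (1541722572000, 26.084599108380274, 123),
    (1541723500000, 20.630034360787512, 123),
    (1541724219000, 1893.00, 123),
    (1541725264000, 23.16455204686255, 123),
    (1541726037000, 15.911555304774817, 123),
    (1541726950000, 20.057274313740784, 123),
    (1541727884000, 12.967418789242549, 123),
    (1541728085000, 2.720850595301784, 123),
]


def split_on_threshold(rows, threshold=1000):
    """Split rows into consecutive segments; a row above threshold starts a new one."""
    # Flag each row that exceeds the threshold, then take the running sum of
    # the flags: every row in the same segment gets the same running total.
    flags = (1 if distance > threshold else 0 for _, distance, _ in rows)
    segment_ids = accumulate(flags)

    segments = {}
    for segment_id, row in zip(segment_ids, rows):
        segments.setdefault(segment_id, []).append(row)

    # Return the segments in order of their segment number.
    return [segments[key] for key in sorted(segments)]


for part in split_on_threshold(rows):
    # Show the size of each segment and the row it starts with.
    print(len(part), part[0])
```

On the sample data this gives three groups of 6, 8 and 6 rows, matching the three tables above. I assume the same running count could be computed in Spark with a window function ordered by timestamp, but I have not verified that on 2.0.0.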
I am using Spark 2.0.0.
Thanks