I have a table containing raw data, and I need to convert the values to a uniform format, for example:
Rd - Road
St - Street
PL - Place
Dr - Drive
Ave - Avenue
I am also looking for a method I can use for this kind of data cleaning.
Answer 0 (score: 0)
I think this is what you mean:
spark.sql("select distinct a.prod_code,a.bal,a.v_txn_id from aggregates a join (select distinct v_txn_id,max(timestamp) over(partition by v_txn_id) as temp_timestamp from aggregates) b on a.v_txn_id=b.v_txn_id and a.timestamp=b.temp_timestamp order by a.v_txn_id").show()
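The query above keeps, for each v_txn_id, the rows with the latest timestamp. For the abbreviation mapping in the question itself, here is a minimal sketch in plain Python, assuming the abbreviations appear as whole words (optionally followed by a period) inside an address string. The names `SUFFIXES` and `normalize_address` are hypothetical, and the mapping contains only the five examples from the question.

```python
import re

# Abbreviation -> full form, taken from the examples in the question.
# Keys are lowercase so that "PL", "Pl" and "pl" all resolve to the same entry.
SUFFIXES = {
    "rd": "Road",
    "st": "Street",
    "pl": "Place",
    "dr": "Drive",
    "ave": "Avenue",
}

# Match any abbreviation as a whole word, case-insensitively,
# and also consume one trailing period if present ("St." -> "Street").
_PATTERN = re.compile(r"\b(" + "|".join(SUFFIXES) + r")\b\.?", re.IGNORECASE)


def normalize_address(address: str) -> str:
    """Replace known street-type abbreviations with their full form."""
    # Caveat: this is context-blind, so "St John" or "Dr Smith" would also be expanded.
    return _PATTERN.sub(lambda m: SUFFIXES[m.group(1).lower()], address)


print(normalize_address("12 Main St."))
print(normalize_address("9 Park PL"))
```

If the data lives in a Spark DataFrame, the same idea could probably be applied either by wrapping a function like this in a UDF or by chaining `regexp_replace` calls, one per abbreviation. I have not tested either against your table.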