SQL: tracking the latest record and its updates

Time: 2017-07-14 01:49:56

Tags: sql hive apache-pig pyspark-sql

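I have a history table, id_track, in which ids are replaced by new ids at different timestamps. I want to consolidate each id to its latest value by following the updates iteratively. How can I do this in SQL?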

Table:

OLD_ID  NEW_ID  TIME-STAMP
101     103     1/5/2001
102     108     2/5/2001
103     105     3/5/2001
105     106     4/5/2001
110     111     4/5/2001
108     116     14/5/2001
112     117     4/6/2001
104     118     4/7/2001
111     119     4/8/2001

Desired result table: [image not preserved]

1 Answer:

Answer 0: (score: 0)

    SELECT old_id,
           (SELECT MAX(new_id)
              FROM test01 b
             START WITH old_id = a.old_id
           CONNECT BY old_id = PRIOR new_id) new_id,
           (SELECT time_stamp
              FROM test01 c
             WHERE new_id = (SELECT MAX(new_id)
                               FROM test01 b
                              START WITH old_id = a.old_id
                            CONNECT BY old_id = PRIOR new_id)) time_stamp
      FROM test01 a
     WHERE old_id NOT IN (SELECT new_id FROM test01 c)
     ORDER BY old_id ASC;

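Here test01 is the table that holds the data. We have to use START WITH .. CONNECT BY PRIOR, which is Oracle's hierarchical query syntax. Note that MAX(new_id) finds the end of each chain only because the ids in the sample data increase with every update.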
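For engines without CONNECT BY, the same chain-following idea can be sketched with a recursive common table expression. This is not the query from the answer above, only an illustration under assumptions: it runs on SQLite through Python's sqlite3 module purely so that it is runnable, it reuses the table name test01 and the sample rows from the question, and it assumes the update chains contain no cycles. The names chain, root_id, depth and latest_ids are made up for this example. Whether your own engine accepts WITH RECURSIVE is something you would need to check.

```python
import sqlite3

# Sample rows from the question: (old_id, new_id, time_stamp)
ROWS = [
    (101, 103, "1/5/2001"),
    (102, 108, "2/5/2001"),
    (103, 105, "3/5/2001"),
    (105, 106, "4/5/2001"),
    (110, 111, "4/5/2001"),
    (108, 116, "14/5/2001"),
    (112, 117, "4/6/2001"),
    (104, 118, "4/7/2001"),
    (111, 119, "4/8/2001"),
]

# Hypothetical query: walk each chain from its starting id and keep the last hop.
QUERY = """
WITH RECURSIVE chain(root_id, new_id, time_stamp, depth) AS (
    -- anchor: ids that never appear as a new_id are the starts of chains
    SELECT old_id, new_id, time_stamp, 1
      FROM test01
     WHERE old_id NOT IN (SELECT new_id FROM test01)
    UNION ALL
    -- recursive step: the current new_id becomes the next old_id
    SELECT ch.root_id, t.new_id, t.time_stamp, ch.depth + 1
      FROM chain ch
      JOIN test01 t ON t.old_id = ch.new_id
)
SELECT root_id, new_id, time_stamp
  FROM chain outer_ch
 WHERE depth = (SELECT MAX(depth)
                  FROM chain inner_ch
                 WHERE inner_ch.root_id = outer_ch.root_id)
 ORDER BY root_id
"""


def latest_ids(rows):
    """Load rows into an in-memory table and return (old_id, latest_id, time_stamp)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE test01 (old_id INTEGER, new_id INTEGER, time_stamp TEXT)"
        )
        conn.executemany("INSERT INTO test01 VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_ids(ROWS):
        print(row)
```

With the sample data, id 101 resolves through 103 and 105 to 106. Picking the row with the greatest depth, rather than MAX(new_id), means the sketch does not depend on new ids being numerically larger than old ones.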