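I have the table below in Hive and need to fill each null value with the first non-null value that follows it. I tried the LEAD() function, but it only fixes the group with key 1. Because it is not known how many missing values come before the first non-missing value, LEAD cannot be applied with an exact offset. I could use a window function partitioned by key, but within a single key there can be subgroups that alternate between missing and non-missing values (for example, key 5).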
Note: the first non-missing value appears when var = d1.

key  date_time            var  transid
======================================
1    2019-11-23 15:02:55  p1   null   (populated with 82 using LEAD())
1    2019-11-23 15:04:06  d1   82
1    2019-11-23 15:04:29  e1   82
1    2019-11-23 15:05:32  ads  82
1    2019-11-23 15:05:35  ads  82
1    2019-11-23 15:05:55  tf   82
2    2019-11-23 13:23:31  p1   null   (should be populated with 87)
2    2019-11-23 13:26:02  p1   null   (should be populated with 87)
2    2019-11-23 13:29:54  d1   87
2    2019-11-23 13:32:06  e1   87
2    2019-11-23 13:33:21  ads  87
2    2019-11-23 13:33:24  ads  87
2    2019-11-23 13:33:40  ps   87
5    2019-11-24 18:42:13  p1   null   (should be populated with 84)
5    2019-11-24 18:45:02  p1   null   (should be populated with 84)
5    2019-11-24 18:45:32  p2   null   (should be populated with 84)
5    2019-11-24 18:46:39  p2   null   (should be populated with 84)
5    2019-11-24 18:47:34  d1   84
5    2019-11-24 18:47:58  d2   84
5    2019-11-24 18:48:56  p1   null   (should be populated with 15)
5    2019-11-24 18:49:38  p1   null   (should be populated with 15)
5    2019-11-24 18:50:33  d1   15
5    2019-11-24 18:50:53  ads  15
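
For reference, one possible way to express this kind of fill in Hive without knowing the LEAD offset is FIRST_VALUE with its optional second argument set to true, which skips nulls, over a frame that runs from the current row to the end of the partition. This is only a sketch: the table name `events` and the output alias `transid_filled` are made up for illustration, and it assumes that `date_time` orders the rows within each key as shown in the table.

```sql
-- Sketch only: `events` is a hypothetical table name and
-- `transid_filled` is a hypothetical output column.
SELECT
  key,
  date_time,
  var,
  transid,
  -- The second argument `true` tells Hive's FIRST_VALUE to skip nulls.
  -- The frame starts at the current row and looks forward, so a non-null
  -- row keeps its own value and a null row takes the next non-null
  -- transid that appears later within the same key.
  FIRST_VALUE(transid, true) OVER (
    PARTITION BY key
    ORDER BY date_time
    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
  ) AS transid_filled
FROM events;
```

Because the frame only looks forward from each row, the nulls at 18:48:56 and 18:49:38 in key 5 would pick up 15 rather than 84, which matches the expected output described in the table. A trailing run of nulls with no later non-null value in the same key would stay null.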