I have the following DataFrame:
+-----+---------+-----------+-------------------+
|id   |lat      |lng        |timestamp          |
+-----+---------+-----------+-------------------+
|user1|3.1357369|101.6863713|2017-11-06 19:33:16|
|user1|3.1360323|101.6874385|2017-11-06 21:10:25|
|user1|3.1363076|101.6902847|2017-11-07 01:39:07|
|user1|3.1357369|101.6863713|2017-11-07 01:39:07|
|user1|3.1357369|101.6863713|2017-11-07 04:16:30|
|user1|3.1357409|101.6860155|2017-11-07 05:05:03|
|user1|3.1357369|101.6863713|2017-11-07 05:05:03|
|user1|3.1357369|101.6863713|2017-11-07 06:13:07|
|user1|3.1360323|101.6874385|2017-11-07 06:13:07|
+-----+---------+-----------+-------------------+
I can use Window with partitionBy on id and timestamp to find count (the number of occurrences), precount (the previous count), and pretsp (the previous timestamp).
This is the code I use to build the output DataFrame:
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, count, first, lag, when}

// All rows of one id, ordered by time.
val specDevicePartiton = Window.partitionBy("id").orderBy("timestamp")
// Rows of one id that share the same timestamp.
val specDevicePartitonTimeStamp = Window.partitionBy("id", "timestamp").orderBy("timestamp")
// Note: specDevicePartitonPreTimeStamp is used below, but its definition is not included in this snippet.

val userProfileDF = deviceDF
  .withColumn("prelatitude", lag(deviceDF("lat"), 1).over(specDevicePartiton))
  .withColumn("prelongitude", lag(deviceDF("lng"), 1).over(specDevicePartiton))
  .withColumn("pretimestamp", lag(deviceDF("timestamp"), 1).over(specDevicePartiton))
  // Rows that repeat the previous row's timestamp reuse the first pretimestamp of their group.
  .withColumn("pretsp", when(col("timestamp") === col("pretimestamp"),
    first(col("pretimestamp")).over(specDevicePartitonTimeStamp)).otherwise(col("pretimestamp")))
  // count: number of rows that share this id and timestamp.
  .withColumn("count", count("timestamp").over(specDevicePartitonTimeStamp))
  .withColumn("previousCount", lag(col("count"), 1).over(specDevicePartiton))
  .withColumn("precount", when(col("timestamp") === col("pretimestamp"),
    first(col("previousCount")).over(specDevicePartitonTimeStamp)).otherwise(col("previousCount")))
  // Attempt at the first lat/lng of the previous group, using a fixed frame of the two preceding rows.
  .withColumn("preFirstLat", when(col("precount") > 1 && col("count") === 1,
    first(col("lat")).over(specDevicePartitonPreTimeStamp.rowsBetween(-2, -1))))
  .withColumn("preFirstLng", when(col("precount") > 1 && col("count") === 1,
    first(col("lng")).over(specDevicePartitonPreTimeStamp)))
  .drop("prelatitude", "prelongitude", "nxtlatitude", "nxtlongitude", "pretimestamp")
I want to find the first and last values from the rows that precede the current row. The expected output would look like this:
+-----+---------+-----------+-------------------+-------------------+-----+--------+-----------+-----------+
|id   |lat      |lng        |timestamp          |pretsp             |count|precount|preFirstLat|preFirstLng|
+-----+---------+-----------+-------------------+-------------------+-----+--------+-----------+-----------+
|user1|3.1357369|101.6863713|2017-11-06 19:33:16|2017-11-06 18:44:12|1    |1       |null       |null       |
|user1|3.1360323|101.6874385|2017-11-06 21:10:25|2017-11-06 19:33:16|1    |1       |null       |null       |
|user1|3.1357369|101.6863713|2017-11-07 01:39:07|2017-11-06 21:10:25|2    |1       |null       |null       |
|user1|3.1363076|101.6902847|2017-11-07 01:39:07|2017-11-06 21:10:25|2    |1       |null       |null       |
|user1|3.1357369|101.6863713|2017-11-07 04:16:30|2017-11-07 01:39:07|1    |2       |3.1357369  |101.686727 |
|user1|3.1357369|101.6863713|2017-11-07 05:05:03|2017-11-07 04:16:30|2    |1       |null       |null       |
|user1|3.1357409|101.6860155|2017-11-07 05:05:03|2017-11-07 04:16:30|2    |1       |null       |null       |
|user1|3.1360323|101.6874385|2017-11-07 06:13:07|2017-11-07 05:05:03|2    |2       |null       |null       |
|user1|3.1357369|101.6863713|2017-11-07 06:13:07|2017-11-07 05:05:03|2    |2       |null       |null       |
+-----+---------+-----------+-------------------+-------------------+-----+--------+-----------+-----------+
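Logic: find the first lat and lng values from the rows that precede the current row, where those preceding rows share the same timestamp but have different lat and lng values. Example: compare the rows with timestamp = 2017-11-07 04:16:30 and 2017-11-07 05:05:03 in the output above.
I tried to make the frame dynamic with rowsBetween(start, end), treating precount as start and -1 as end, but I don't know how to achieve this.
If I get a solution for finding the first value, I will need the same calculation for the last value, and I think it will work the same way.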
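To make the intended logic concrete, here is a minimal sketch in plain Scala collections (not Spark), so it only illustrates what I am trying to compute. The names Point, Enriched, withPreviousGroupEnds and PreviousGroupSketch are hypothetical and exist only for this illustration. It assumes that "first" and "last" mean the input order of the rows that share a timestamp, and that the timestamp strings sort chronologically. It also ignores the precount > 1 and count = 1 condition and simply attaches the previous group's first and last values to every row.

```scala
object PreviousGroupSketch {
  // Hypothetical row type; the fields mirror the columns in this post.
  final case class Point(id: String, lat: Double, lng: Double, timestamp: String)

  // A row plus the first/last (lat, lng) of the previous timestamp group, if there is one.
  final case class Enriched(
      point: Point,
      preFirst: Option[(Double, Double)],
      preLast: Option[(Double, Double)])

  def withPreviousGroupEnds(points: Seq[Point]): Seq[Enriched] =
    points.groupBy(_.id).toSeq.sortBy(_._1).flatMap { case (_, rows) =>
      // Rows sharing a timestamp form one group; groupBy keeps input order inside a group.
      // "yyyy-MM-dd HH:mm:ss" strings sort chronologically, so sorting the keys orders the groups.
      val groups: Seq[Seq[Point]] =
        rows.groupBy(_.timestamp).toSeq.sortBy(_._1).map(_._2)
      // Pair each group with its predecessor (None for the earliest group).
      val previous: Seq[Option[Seq[Point]]] =
        None +: groups.init.map(g => Option(g))
      groups.zip(previous).flatMap { case (group, prev) =>
        val first = prev.map(g => (g.head.lat, g.head.lng))
        val last = prev.map(g => (g.last.lat, g.last.lng))
        group.map(p => Enriched(p, first, last))
      }
    }

  // A few rows taken from the sample data above.
  val sample: Seq[Point] = Seq(
    Point("user1", 3.1360323, 101.6874385, "2017-11-06 21:10:25"),
    Point("user1", 3.1357369, 101.6863713, "2017-11-07 01:39:07"),
    Point("user1", 3.1363076, 101.6902847, "2017-11-07 01:39:07"),
    Point("user1", 3.1357369, 101.6863713, "2017-11-07 04:16:30"))

  val result: Seq[Enriched] = withPreviousGroupEnds(sample)

  def main(args: Array[String]): Unit = result.foreach(println)
}
```

With these four rows, the 04:16:30 row gets (3.1357369, 101.6863713) as the previous group's first value and (3.1363076, 101.6902847) as its last value. As far as I can tell, rowsBetween only accepts fixed Long bounds, which is why I cannot pass precount into the frame directly and am looking for a Spark equivalent of this grouping.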
Here is a simple example:
+-----+---------+-----------+-------------------+-------------------+-----+--------+-----------+-----------+
|id   |lat      |lng        |timestamp          |pretsp             |count|precount|preFirstLat|preFirstLng|
+-----+---------+-----------+-------------------+-------------------+-----+--------+-----------+-----------+
|user1|3.1357369|101.6863713|2017-11-06 19:33:16|2017-11-06 18:44:12|1    |null    |null       |null       |
|user1|3.1360323|101.6874385|2017-11-06 21:10:25|2017-11-06 19:33:16|1    |1       |3.1357369  |101.6863713|
|user1|3.1357369|101.6863713|2017-11-07 01:39:07|2017-11-06 21:10:25|2    |1       |3.1360323  |101.6874385|
|user1|3.1363076|101.6902847|2017-11-07 01:39:07|2017-11-06 21:10:25|2    |1       |3.1360323  |101.6874385|
|user1|3.1357369|101.6863713|2017-11-07 04:16:30|2017-11-07 01:39:07|1    |2       |3.1357369  |101.686727 |
|user1|3.1357369|101.6863713|2017-11-07 05:05:03|2017-11-07 04:16:30|2    |1       |3.1357369  |101.6863713|
|user1|3.1357409|101.6860155|2017-11-07 05:05:03|2017-11-07 04:16:30|2    |1       |3.1357369  |101.6863713|
|user1|3.1360323|101.6874385|2017-11-07 06:13:07|2017-11-07 05:05:03|2    |2       |3.1357369  |101.6863713|
|user1|3.1357369|101.6863713|2017-11-07 06:13:07|2017-11-07 05:05:03|2    |2       |3.1357369  |101.6863713|
+-----+---------+-----------+-------------------+-------------------+-----+--------+-----------+-----------+
Here, in the highlighted rows for 2017-11-08, preFirstVal 20 is the first entry of 2017-11-07 and preLastVal 25 is the last entry of 2017-11-07.
Thanks,