<asp:UpdatePanel ID="up" runat="server">
<ContentTemplate>
<asp:ListView ID="products" runat="server" OnPagePropertiesChanging="OnPagePropertiesChanging">
<ItemTemplate>
Content Here
</ItemTemplate>
<LayoutTemplate>
<div id="itemPlaceholderContainer" runat="server" style="">
<div runat="server" id="itemPlaceholder" />
</div>
<div class="clearfix"></div>
<div class="datapager">
<asp:DataPager ID="DataPager1" ClientIDMode="Static" runat="server" PageSize="24" PagedControlID="products" ViewStateMode="Enabled">
<Fields>
<asp:NextPreviousPagerField ButtonType="Link" ShowFirstPageButton="false" ShowPreviousPageButton="False" ShowNextPageButton="false" ButtonCssClass="nextPre" />
<asp:NumericPagerField ButtonType="Link" ButtonCount="10" />
<asp:NextPreviousPagerField ButtonType="Link" ShowNextPageButton="true" ShowLastPageButton="false" ShowPreviousPageButton="false" ButtonCssClass="nextPre" />
</Fields>
</asp:DataPager>
<div class="clear"></div>
</div>
</LayoutTemplate>
</asp:ListView>
</ContentTemplate>
</asp:UpdatePanel>
I have a Dataset in Spark:
+---------+---------------------+--------+
| name | datetime | status |
+---------+---------------------+--------+
| object1 | 2016-05-21T05:20:56 | OK |
| object1 | 2016-05-21T05:21:00 | OK |
+---------+---------------------+--------+
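Given the example above, how can I calculate the time difference between rows of the same object whose status is OK?
I want to return the computed time difference for each object.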
Answer 0: (score: 2)
You should be able to do this with built-in Spark functions and a window aggregate function.
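// Window over each object's rows, ordered by time
val names = Window.partitionBy('name).orderBy('datetime)
val withPreviousDateTime = df
  // Attach the previous row's datetime within the same name
  .withColumn("previousTime", lag('datetime, 1) over names)
  // Difference in seconds between each row and the previous one
  .withColumn("difference", unix_timestamp('datetime) - unix_timestamp('previousTime))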
Of course, you should add the following at the beginning:
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._
import spark.implicits._
so that the implicits, Window, and the functions are visible.
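The timestamps in the question use a T separator (2016-05-21T05:20:56), which the default unix_timestamp pattern (yyyy-MM-dd HH:mm:ss) does not parse. A minimal self-contained sketch of the same windowed approach, assuming the column is a string in that ISO-like format, could pass the pattern explicitly. The object name TimeDiffSketch, the helper withDifference, and the column names ts and difference are illustrative assumptions, not part of the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, lag, unix_timestamp}

object TimeDiffSketch {
  // Assumed pattern for string timestamps like 2016-05-21T05:20:56
  val IsoPattern = "yyyy-MM-dd'T'HH:mm:ss"

  // Hypothetical helper: keeps OK rows and adds the gap (in seconds)
  // to the previous OK row of the same name.
  def withDifference(df: DataFrame): DataFrame = {
    val byName = Window.partitionBy(col("name")).orderBy(col("ts"))
    df.where(col("status") === "OK")
      // Parse the string into epoch seconds using the explicit pattern
      .withColumn("ts", unix_timestamp(col("datetime"), IsoPattern))
      // lag yields null for the first row of each name, so difference is null there
      .withColumn("difference", col("ts") - lag(col("ts"), 1).over(byName))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("time-diff-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      ("object1", "2016-05-21T05:20:56", "OK"),
      ("object1", "2016-05-21T05:21:00", "OK")
    ).toDF("name", "datetime", "status")

    withDifference(df).show(false)
    spark.stop()
  }
}
```

Filtering on status before applying the window means the difference is taken between consecutive OK rows only; dropping the where clause would instead compare each row with its immediate predecessor regardless of status.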
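Answer 1: (score: 1)
My first suggestion is to stick to java.sql.Timestamp for times, because it is natively supported by Spark SQL and therefore easier to work with.
Let's start with some setup: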
import java.sql.Timestamp
import org.apache.spark.sql.expressions.Window._
import org.apache.spark.sql.functions._
import spark.implicits._
final case class Data(name: String, datetime: Timestamp, status: String)
// Sample rows; spark and sc are the session and context provided by spark-shell
val ds = spark.createDataset(
  sc.parallelize(Seq(
    Data("object1", Timestamp.valueOf("2016-05-21 05:20:56"), "OK"),
    Data("object2", Timestamp.valueOf("2016-05-21 05:20:57"), "OK"),
    Data("object3", Timestamp.valueOf("2016-05-21 05:20:58"), "OK"),
    Data("object2", Timestamp.valueOf("2016-05-21 05:20:58"), "KO"),
    Data("object3", Timestamp.valueOf("2016-05-21 05:20:59"), "OK"),
    Data("object1", Timestamp.valueOf("2016-05-21 05:21:00"), "OK")
  ))
)
Now that we have the Dataset ready, let's get to work:
val result =
ds.
where($"status" === "OK").
withColumn("t", lag('datetime, 1).over(partitionBy($"name").orderBy('datetime))).
withColumn("duration", unix_timestamp($"datetime") - unix_timestamp($"t")).
select($"name", $"duration").where(not($"duration".isNull))
If you now run result.show(), you should see the following:
+-------+--------+
| name|duration|
+-------+--------+
|object1| 4|
|object3| 1|
+-------+--------+
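To make the semantics of that query concrete, here is a rough plain-Scala sketch (no Spark) of the same computation: keep the OK rows, group them by name, order each group by time, and subtract each timestamp from the next. The names Event, LagSketch, and durations are illustrative assumptions; only the sample rows come from the answer.

```scala
import java.sql.Timestamp

final case class Event(name: String, datetime: Timestamp, status: String)

object LagSketch {
  // For each name, pair every OK event with the previous OK event and
  // return the gap in seconds. Names with a single OK event yield nothing,
  // which mirrors dropping the rows where the lagged value is null.
  def durations(events: Seq[Event]): Seq[(String, Long)] =
    events
      .filter(_.status == "OK")
      .groupBy(_.name)
      .toSeq
      .sortBy(_._1) // stable output order by name
      .flatMap { case (name, group) =>
        val sorted = group.sortBy(_.datetime.getTime)
        // zip each event with its successor: (previous, current)
        sorted.zip(sorted.drop(1)).map { case (prev, cur) =>
          name -> (cur.datetime.getTime - prev.datetime.getTime) / 1000L
        }
      }

  def main(args: Array[String]): Unit = {
    val events = Seq(
      Event("object1", Timestamp.valueOf("2016-05-21 05:20:56"), "OK"),
      Event("object2", Timestamp.valueOf("2016-05-21 05:20:57"), "OK"),
      Event("object3", Timestamp.valueOf("2016-05-21 05:20:58"), "OK"),
      Event("object2", Timestamp.valueOf("2016-05-21 05:20:58"), "KO"),
      Event("object3", Timestamp.valueOf("2016-05-21 05:20:59"), "OK"),
      Event("object1", Timestamp.valueOf("2016-05-21 05:21:00"), "OK")
    )
    durations(events).foreach { case (name, secs) => println(s"$name $secs") }
  }
}
```

This is only meant to illustrate what the window expression does per partition; on real data the Spark version is the one that scales.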
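In the query:
only the rows whose status is OK are kept, and
lag attaches the datetime of the previous row (same name, ordered by datetime) to each row.
Answer 2: (score: 0)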
select name, lead(dt) over (partition by name order by dt) - dt as duration from t where status = 'OK'
This is called a window function or an analytic function.