Answer 0 (score: 3)
One way to do this in U-SQL is to use the ROW_NUMBER ranking function, partitioned by date and ordered by count descending. For example:
// Sample data: one row per (date, exception type) with its occurrence count
@input =
    SELECT *
    FROM (
        VALUES
        ( "2016-01-01T00:00:00", "System.ArgumentNullException", 7 ),
        ( "2016-01-01T00:00:00", "System.IO.EndOfStreamException", 5 ),
        ( "2016-01-01T00:00:00", "System.IO.FileNotFoundException", 4 ),
        ( "2016-01-01T00:00:00", "System.IndexOutofRangeException", 4 ),
        ( "2016-01-01T00:00:00", "System.ArgumentException", 3 ),
        ( "2016-01-02T00:00:00", "System.BadImageFormatException", 18 ),
        ( "2016-01-02T00:00:00", "System.IO.EndOfStreamException", 16 ),
        ( "2016-01-02T00:00:00", "System.NotImplementedException", 14 ),
        ( "2016-01-02T00:00:00", "System.UnauthorizedAccessException", 13 ),
        ( "2016-01-02T00:00:00", "System.ArgumentException", 12 ),
        ( "2016-01-02T00:00:00", "System.IndexOutofRangeException", 5 ),
        ( "2016-01-03T00:00:00", "System.IO.EndOfStreamException", 45 ),
        ( "2016-01-03T00:00:00", "System.FormatException", 42 ),
        ( "2016-01-03T00:00:00", "System.BadImageFormatException", 41 ),
        ( "2016-01-03T00:00:00", "System.IndexOutofRangeException", 41 ),
        ( "2016-01-03T00:00:00", "System.IO.FileNotFoundException", 40 )
    ) AS x(date, exception, count);

// Number the rows within each date, highest count first
@working =
    SELECT ROW_NUMBER() OVER(PARTITION BY date ORDER BY count DESC) AS rn,
           *
    FROM @input;

// Keep the top 3 rows for each date
@output =
    SELECT *
    FROM @working
    WHERE rn <= 3;

OUTPUT @output TO "/output/output.csv"
USING Outputters.Csv();
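
One thing to watch for: the sample data contains ties at the cut-off (two exceptions with a count of 4 on 2016-01-01, and two with 41 on 2016-01-03), and ROW_NUMBER with ORDER BY count DESC alone does not define which of the tied rows gets rn = 3. The following is a rough sketch of two possible ways around that, not part of the tested script: a secondary sort key to make ROW_NUMBER deterministic, and DENSE_RANK to keep all tied rows. It reuses the @input rowset from the script; the names @ranked, @top3WithTies, dr and the output path are made up for illustration.

```usql
// Sketch only: assumes the @input rowset defined in the script above.
// rn: ROW_NUMBER with the exception name as a tie-breaker, so the ordering
//     within a date is deterministic.
// dr: DENSE_RANK over count only, so rows with equal counts share a rank.
@ranked =
    SELECT ROW_NUMBER() OVER(PARTITION BY date ORDER BY count DESC, exception ASC) AS rn,
           DENSE_RANK() OVER(PARTITION BY date ORDER BY count DESC) AS dr,
           date,
           exception,
           count
    FROM @input;

// Keep every row whose count is among the top 3 distinct counts for its date.
// With the sample data this returns 4 rows for 2016-01-01 and 2016-01-03.
@top3WithTies =
    SELECT date,
           exception,
           count,
           dr
    FROM @ranked
    WHERE dr <= 3;

// Hypothetical output path, chosen for this example only.
OUTPUT @top3WithTies TO "/output/top3_with_ties.csv"
USING Outputters.Csv();
```

Filtering on rn <= 3 instead of dr <= 3 would give exactly three rows per date, with ties resolved alphabetically by exception name. Which behaviour is preferable depends on whether "top 3" means three rows or three distinct counts.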
My results from the ROW_NUMBER script above: