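I have a table with the columns date_trans, time_trans, and price. After the SELECT query, I want to add a new column, "Count", that holds the number of consecutive rows with the same price. The earlier rows of each such run should be removed from the final result, so that only the last row of the run remains. See the expected output: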
date_trans  time_trans  price  Count
2011-02-22  09:39:59    58.02  1
2011-02-22  09:40:03    58.1   *ROW WILL BE REMOVED
2011-02-22  09:40:07    58.1   *ROW WILL BE REMOVED
2011-02-22  09:40:08    58.1   3
2011-02-22  09:40:10    58.15  1
2011-02-22  09:40:10    58.1   *ROW WILL BE REMOVED
2011-02-22  09:40:14    58.1   2
2011-02-22  09:40:24    58.15  1
2011-02-22  09:40:24    58.18  *ROW WILL BE REMOVED
2011-02-22  09:40:24    58.18  *ROW WILL BE REMOVED
2011-02-22  09:40:24    58.18  3
2011-02-22  09:40:24    58.15  1
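Please suggest a SQL query or a LINQ expression that selects this from the table.
Currently I run the SELECT query and loop through all the selected rows, but with millions of rows this takes hours.
My current code: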
// Load every row in the date and time window.
string query = @"SELECT date_trans, time_trans, price
                 FROM tbl_data
                 WHERE date_trans BETWEEN '2011-02-22' AND '2011-10-21'
                 AND time_trans BETWEEN '09:30:00' AND '16:00:00'";
DataTable dt = oUtil.GetDataTable(query);
DataColumn col = new DataColumn("Count", typeof(int));
dt.Columns.Add(col);
int priceCount = 1;
for (int count = 0; count < dt.Rows.Count; count++)
{
    double price = Convert.ToDouble(dt.Rows[count]["price"]);
    // The last row has no successor, so 0 is used as a sentinel price.
    double priceNext = (count == dt.Rows.Count - 1) ? 0 : Convert.ToDouble(dt.Rows[count + 1]["price"]);
    if (price == priceNext)
    {
        // Same price as the next row: count it, drop it, and revisit this index.
        priceCount++;
        dt.Rows.RemoveAt(count);
        count--;
    }
    else
    {
        // End of a run: store its length on the last row and reset the counter.
        dt.Rows[count]["Count"] = priceCount;
        priceCount = 1;
    }
}
Answer 0: (score: 2)
This is an interesting one. I think what you need is something like this:
SELECT MAX(date_trans), MAX(time_trans), MAX(price), COUNT(*)
FROM
    (SELECT *,
            ROW_NUMBER() OVER(PARTITION BY price ORDER BY date_trans, time_trans)
            - ROW_NUMBER() OVER(ORDER BY date_trans, time_trans) AS grp
     FROM transactions) grps
GROUP BY grp
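Update
The grouping column also needs to include the price, otherwise the groups may not be unique. In addition, the date and time columns should be combined into a single datetime value, so that the maximum datetime is correct for a group that starts late on one day and ends on the next. Here is the corrected query.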
SELECT MAX(CAST(date_trans AS DATETIME) + CAST(time_trans AS DATETIME)) AS last_trans,
       MAX(price),
       COUNT(*)
FROM
    (SELECT *,
            CAST(ROW_NUMBER() OVER(PARTITION BY price ORDER BY date_trans, time_trans)
                 - ROW_NUMBER() OVER(ORDER BY date_trans, time_trans) AS NVARCHAR(255))
            + '-' + CAST(price AS NVARCHAR(255)) AS grp
     FROM transactions) grps
GROUP BY grp
-- SQL Server rejects ORDER BY inside a derived table, so the sort is applied here.
ORDER BY last_trans
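To see why the row-number difference alone is not a unique group key, here is a small illustration, assuming SQL Server. The `seq` column and the three inline rows are made up for the example and stand in for the date and time ordering.

```sql
-- Illustration only: three hypothetical rows where the price changes and then changes back.
SELECT seq, price,
       ROW_NUMBER() OVER(PARTITION BY price ORDER BY seq)
       - ROW_NUMBER() OVER(ORDER BY seq) AS grp
FROM (VALUES (1, 58.10), (2, 58.15), (3, 58.10)) AS t(seq, price)
ORDER BY seq;
-- seq 1 (58.10): 1 - 1 =  0
-- seq 2 (58.15): 1 - 2 = -1
-- seq 3 (58.10): 2 - 3 = -1   same grp as seq 2, although the price differs
```

Rows 2 and 3 belong to different runs but get the same difference, which is why the price has to be part of the grouping key.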
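The query would probably be more sensible with the 'grp' column as a byte array or a bigint rather than an nvarchar. You also mentioned a "volume" column, which you would probably want to sum within each group.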
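As a rough sketch of the bigint idea, assuming SQL Server: the row-number difference can stay an integer if price is added to the GROUP BY directly instead of being concatenated into a string. The `transactions` table name follows the queries above. The `last_trans` alias and the `volume` column name are assumptions, so the SUM line is left commented out.

```sql
-- Sketch only, not tested against the real table.
SELECT MAX(CAST(date_trans AS DATETIME) + CAST(time_trans AS DATETIME)) AS last_trans,
       price,
       COUNT(*) AS [Count]
       -- , SUM(volume) AS total_volume  -- assumption: a numeric column named volume exists
FROM
    (SELECT date_trans, time_trans, price,
            -- Position in the whole sequence minus position among rows with the
            -- same price: constant within one unbroken run of that price.
            ROW_NUMBER() OVER(ORDER BY date_trans, time_trans)
            - ROW_NUMBER() OVER(PARTITION BY price ORDER BY date_trans, time_trans) AS grp
     FROM transactions) grps
GROUP BY price, grp
ORDER BY last_trans;
```

One caveat: the sample data contains several rows with the same date_trans and time_trans, and ROW_NUMBER() orders such ties arbitrarily. If the table has a unique, increasing column (an identity column, for example), adding it to both ORDER BY clauses would make the runs deterministic.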