Question

Dilemma

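I have a large dataset and need to run a complex calculation over part of it. For that calculation I need to fetch a large chunk of ordered data from the large set, based on input parameters.

My method signature looks like this: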

double Process(Entity e, DateTimeOffset? start, DateTimeOffset? end)

Potential Solutions

I have come up with the following two approaches:

Method 1 - WHERE clause

double result = 0d;
IEnumerable<Quote> items = from item in e.Items
                           where (!start.HasValue || item.Date >= start.Value)
                              && (!end.HasValue || item.Date <= end.Value)
                           orderby item.Date ascending
                           select item;
...
return result;

Method 2 - Skip & Take

double result = 0d;
IEnumerable<Item> items = e.Items.OrderBy(i => i.Date);
if (start.HasValue)
    items = items.SkipWhile(i => i.Date < start.Value);
if (end.HasValue)
    items = items.TakeWhile(i => i.Date <= end.Value);
...
return result;

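Questions

If I were just throwing this together, I would probably go with Method 1. However, both the size of my dataset and the number of datasets are too large to ignore even slight efficiency losses, and it is vital that the resulting enumerable is ordered.

Which approach will generate the more efficient query? Is there a more efficient approach that I have not yet considered?

Any proposed solution can safely assume that the table is well indexed.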

Answer 1

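According to the link, you cannot use SkipWhile without materializing the query, so in case 2 you materialize all entities first and only then compute the result.

In case 1, you let SQL handle the query and materialize only the necessary records, so it is the better option.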
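The difference can be seen without a database. The sketch below is only an illustration under stated assumptions: `Item` and `Demo` are hypothetical names, and an in-memory list wrapped with `AsQueryable()` stands in for a real Entity Framework set. It shows that once the ordered query is stored in an `IEnumerable<Item>` variable, as in Method 2, `SkipWhile` binds to the in-memory `Enumerable.SkipWhile`, while the Method 1 shape stays an `IQueryable<Item>` that a provider can translate as a whole.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity type in the question.
public class Item
{
    public DateTimeOffset Date { get; set; }
}

public static class Demo
{
    // Method 1 shape: Where and OrderBy bind to the Queryable versions,
    // so the whole expression stays available for the provider to translate.
    public static IEnumerable<Item> Method1(IQueryable<Item> source, DateTimeOffset start)
    {
        return source.Where(i => i.Date >= start).OrderBy(i => i.Date);
    }

    // Method 2 shape: the ordered query is stored as IEnumerable<Item>,
    // so SkipWhile binds to Enumerable.SkipWhile and runs in memory.
    public static IEnumerable<Item> Method2(IQueryable<Item> source, DateTimeOffset start)
    {
        IEnumerable<Item> items = source.OrderBy(i => i.Date);
        return items.SkipWhile(i => i.Date < start);
    }

    public static void Main()
    {
        var start = new DateTimeOffset(2020, 1, 2, 0, 0, 0, TimeSpan.Zero);

        // In-memory list standing in for a database-backed set (assumption).
        IQueryable<Item> source = Enumerable.Range(0, 5)
            .Select(d => new Item { Date = start.AddDays(d - 1) })
            .ToList()
            .AsQueryable();

        Console.WriteLine(Method1(source, start) is IQueryable<Item>); // True
        Console.WriteLine(Method2(source, start) is IQueryable<Item>); // False

        // Both shapes return the same rows; only where the filtering runs differs.
        Console.WriteLine(Method1(source, start).Count() == Method2(source, start).Count()); // True
    }
}
```

Against a real database, the second shape means the provider only ever sees the `OrderBy`, so every row is fetched before the skipping starts.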

Edit:

I created sample data and profiled the queries sent to the database:

-- Case 1 (WHERE clause)
SELECT [Project1].[Id] AS [Id], [Project1].[AddedDate] AS [AddedDate], [Project1].[SendDate] AS [SendDate]
FROM (
    SELECT [Extent1].[Id] AS [Id], [Extent1].[AddedDate] AS [AddedDate], [Extent1].[SendDate] AS [SendDate]
    FROM [dbo].[Alerts] AS [Extent1]
    WHERE ([Extent1].[AddedDate] >= @p__linq__0) AND ([Extent1].[AddedDate] <= @p__linq__1)
) AS [Project1]
ORDER BY [Project1].[AddedDate] ASC

-- Case 2 (no WHERE clause: every row is returned)
SELECT [Extent1].[Id] AS [Id], [Extent1].[AddedDate] AS [AddedDate], [Extent1].[SendDate] AS [SendDate]
FROM [dbo].[Alerts] AS [Extent1]
ORDER BY [Extent1].[AddedDate] ASC

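I inserted 1,000,000 records and wrote the queries so that 1 row was expected in the result. In case 1, the query took 291 ms and materialization was instant. In case 2, the query took 1065 ms and I had to wait about 10 seconds for the result to materialize.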

Answer 2

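SkipWhile is not supported for translation to SQL. You need to throw that option away.

The best way to solve this is to create an index on the field used for the range selection and then issue a SARGable query. where date >= start && date < end is SARGable and can make use of an index.

!start.HasValue || is not a good idea because it destroys SARGability. Build the query so that this is not needed. For example: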

if(start != null) query = query.Where(...);

Make the index covering and you get optimal performance. Not a single extra row needs to be processed.
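Spelled out a little further, the conditional composition could look like the sketch below. It is only a sketch under stated assumptions: `Item`, `RangeQuery`, and `OrderedRange` are hypothetical names, the upper bound is inclusive (`<=`) as in the question's code rather than `<`, and an in-memory list wrapped with `AsQueryable()` stands in for a real Entity Framework set. Each `Where` is added only when its bound is supplied, so the predicate that reaches the provider never contains the `HasValue ||` branch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity type; only Date matters for the range selection.
public class Item
{
    public int Id { get; set; }
    public DateTimeOffset Date { get; set; }
}

public static class RangeQuery
{
    // Adds a filter per supplied bound, then orders by date.
    public static IQueryable<Item> OrderedRange(
        IQueryable<Item> source, DateTimeOffset? start, DateTimeOffset? end)
    {
        IQueryable<Item> query = source;

        if (start.HasValue)
        {
            DateTimeOffset lower = start.Value; // plain value captured by the lambda
            query = query.Where(i => i.Date >= lower);
        }

        if (end.HasValue)
        {
            DateTimeOffset upper = end.Value;
            query = query.Where(i => i.Date <= upper); // inclusive, as in the question
        }

        return query.OrderBy(i => i.Date);
    }
}

public static class Program
{
    public static void Main()
    {
        var t0 = new DateTimeOffset(2020, 1, 1, 0, 0, 0, TimeSpan.Zero);

        // Ids 0..4 stored in descending date order (Id 0 is the latest).
        IQueryable<Item> source = Enumerable.Range(0, 5)
            .Select(d => new Item { Id = d, Date = t0.AddDays(4 - d) })
            .ToList()
            .AsQueryable();

        var range = RangeQuery.OrderedRange(source, t0.AddDays(1), t0.AddDays(3));
        Console.WriteLine(string.Join(",", range.Select(i => i.Id))); // 3,2,1

        // With no bounds, no Where is added at all.
        Console.WriteLine(RangeQuery.OrderedRange(source, null, null).Count()); // 5
    }
}
```

Because the method returns `IQueryable<Item>`, nothing runs until the caller enumerates the result, so a provider should be able to turn the filters and the ordering into a single SQL statement.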

What is the most efficient way to get an ordered range using EntityFramework?
