Question

We are working on a project that has to handle a large number of measurements. The system may receive 10,000 measurements per minute.

The data is simple and looks like this:

 device ID  | measurement_type |  a time stamp |  floating point value

There are 20-30 measurement types per device. A measurement is taken every 5 minutes.

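Write performance is not critical, so the system can be optimized for reads. If the system were implemented in SQL, most queries would have this form: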

select value 
from measurements 
where 
    device_id = :id and 
    measurements_type = :typeid and 
    start_time between :start and :stop

How should such a system be designed for high-performance reads?

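One idea we have is to create two additional tables next to the raw one, one storing hourly values and one storing daily values, and then implement a service that aggregates the 5-minute values into hourly and daily values.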
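A minimal sketch of that roll-up idea, using Python's built-in sqlite3 module as a stand-in for whatever database is eventually chosen. The table name measurements_hourly, its columns, the function rollup_hourly, and the choice of AVG as the aggregate are all assumptions made for illustration; a daily table would follow the same pattern with a different time bucket.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE measurements (
        device_id         INTEGER NOT NULL,
        measurements_type INTEGER NOT NULL,
        start_time        TEXT    NOT NULL,  -- ISO-8601 timestamp string
        value             REAL    NOT NULL
    );
    -- Hypothetical roll-up table: one row per device, type and hour.
    CREATE TABLE measurements_hourly (
        device_id         INTEGER NOT NULL,
        measurements_type INTEGER NOT NULL,
        hour_start        TEXT    NOT NULL,
        avg_value         REAL    NOT NULL,
        sample_count      INTEGER NOT NULL,
        PRIMARY KEY (device_id, measurements_type, hour_start)
    );
    """
)

# Twelve 5-minute samples (values 0.0 .. 11.0) for one device and type in one hour.
rows = [
    (1, 2, f"2024-01-01T10:{minute:02d}:00", float(i))
    for i, minute in enumerate(range(0, 60, 5))
]
conn.executemany("INSERT INTO measurements VALUES (?, ?, ?, ?)", rows)


def rollup_hourly(conn):
    """Recompute hourly averages from the raw 5-minute rows (assumed aggregate: AVG)."""
    conn.execute(
        """
        INSERT OR REPLACE INTO measurements_hourly
        SELECT device_id,
               measurements_type,
               strftime('%Y-%m-%dT%H:00:00', start_time) AS hour_start,
               AVG(value),
               COUNT(*)
        FROM measurements
        GROUP BY device_id, measurements_type, hour_start
        """
    )
    conn.commit()


rollup_hourly(conn)
hourly = conn.execute("SELECT * FROM measurements_hourly").fetchall()
print(hourly)
```

In a real service the roll-up would presumably be restricted to recently closed hours instead of rescanning the whole raw table on every run.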

Are there other systems, besides SQL-based ones, that would make sense for fast reads?

Answer 1

Given that you intend to use a SQL-based system:

If you want fast reads, make sure your indexes are set up correctly so that you get an INDEX SEEK rather than an INDEX SCAN.

Looking at your query, it seems you would need an index on device_id, measurements_type and start_time, with value as an included column of that index. More on this: Why is SQL Server not using Index for very similar datetime query?
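As a rough illustration of such an index, here is a sketch that uses Python's built-in sqlite3 module instead of SQL Server, so it can be run anywhere. SQLite has no INCLUDE clause, so value is appended as a trailing key column to make the index covering; on SQL Server the equivalent would be an index on (device_id, measurements_type, start_time) with INCLUDE (value). The index name ix_measurements_lookup and the sample parameter values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE measurements (
        device_id         INTEGER NOT NULL,
        measurements_type INTEGER NOT NULL,
        start_time        TEXT    NOT NULL,  -- ISO-8601 string, sorts chronologically
        value             REAL    NOT NULL
    )
    """
)

# Equality columns first, then the range column, then the selected column,
# so the query can be answered from the index alone.
conn.execute(
    "CREATE INDEX ix_measurements_lookup "
    "ON measurements (device_id, measurements_type, start_time, value)"
)

query = """
    SELECT value
    FROM measurements
    WHERE device_id = :id
      AND measurements_type = :typeid
      AND start_time BETWEEN :start AND :stop
"""
params = {
    "id": 1,
    "typeid": 2,
    "start": "2024-01-01T00:00:00",
    "stop": "2024-01-02T00:00:00",
}

# The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
plan_text = " ".join(row[3] for row in plan)
print(plan_text)
```

In SQLite's plan output, SEARCH plays roughly the role of a seek and SCAN the role of a scan, so the printed plan should report a SEARCH using the covering index rather than a SCAN of the table.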

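In addition, the parameter values in your query must use the same data types as the columns they are compared against, so that the index is actually used for your query. With SQL Server, for example, you can verify this with the "Include Actual Execution Plan" feature in SQL Server Management Studio.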