Database design with rectangular data

Time: 2016-02-12 19:21:02

Tags: mysql database database-design

I'm trying to learn SQL and database design and need some help choosing a good design for my database in this case. I'm using C# and MySQL.
My input data in this lesson consists of energy meters, each with a unique identification number, and every meter delivers one value per hour. I have data from 2013 onward, and collection will continue for an unspecified period; my best guess is 5 years ahead. There are roughly 25 000 meters, so there will be 25e3 * 24 = 600 000 data points a day. I receive this data once a day as a file. The number of meters changes slowly: around 500 changes per year, counting both added and removed meters. As a bonus, I would like to know when each value was added to the database so that I can calculate a performance index for the collection system. So this is the input data for each meter:

  • Valuetime (datetime)
  • Value (decimal data)
  • Date_added (datetime)

Every meter delivers one type of data, so I can record the data type in a separate table and the data itself will consist of anonymous decimal values. This is where my problem begins. I have tried some different design approaches:

  1. One large table in which each row holds one hour of data, with one column per meter. This failed because of the large number of columns, and I would need a separate, equally big table for “Date_added”.
  2. One table per meter, with columns valuetime, value and date_added. This failed because of slow performance in the C# program.
  3. Partitioned tables (i.e. table1 = meters whose ID begins with 1, and so forth). This still leads to many columns.
  4. Partitioned tables where table10 = meters whose ID begins with 10, and so forth. This still leads to many columns.

All of the solutions above lead to quite slow performance when adding data to the database.

If I search Stack Overflow and elsewhere for database design with a large number of columns, I always find the answer “Normalize!”, but as a novice I do not know how to do that in my case. Every valuetime is unique and every meter ID is unique, which is why I call my data rectangular.

Can someone please lead me to the right path?

2 answers:

Answer 0 (score: 0)

For your input data:

Meter Table:

ID int PK IDENTITY(1, 1)
MeterName varchar

ReadingsTable:

ID int PK IDENTITY(1, 1)
MeterID int FK
Value decimal
TimeStamp datetime
DateAdded date
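
As a rough sketch, the two tables above could be written as MySQL DDL along these lines. Note that IDENTITY(1, 1) is SQL Server syntax; the MySQL counterpart is AUTO_INCREMENT. The snake_case table and column names, the VARCHAR length, the DECIMAL precision and the unique key on (meter_id, value_time) are assumptions added for illustration and are not part of the answer.

```sql
-- Sketch only: names, lengths and precision are assumptions.
CREATE TABLE meter (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    meter_name VARCHAR(64)  NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uq_meter_name (meter_name)
) ENGINE=InnoDB;

CREATE TABLE reading (
    id            BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    meter_id      INT UNSIGNED    NOT NULL,
    reading_value DECIMAL(12,3)   NOT NULL,
    value_time    DATETIME        NOT NULL,  -- the hour the value belongs to
    date_added    DATE            NOT NULL,  -- the day the row was loaded
    PRIMARY KEY (id),
    -- Assumption: one value per meter per hour, so this pair is unique.
    UNIQUE KEY uq_meter_time (meter_id, value_time),
    CONSTRAINT fk_reading_meter FOREIGN KEY (meter_id) REFERENCES meter (id)
) ENGINE=InnoDB;
```

At 600 000 rows a day the readings table grows by about 219 million rows a year (600 000 × 365), which is why the sketch uses BIGINT for the surrogate key to leave headroom.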

You should populate this with an ETL - make an SSIS package or something. Definitely better than a C# app, in my opinion.

Next, you can make aggregation tables:

DailyAggTable:

ID int PK IDENTITY(1, 1)
MeterID int FK
SumOfValue decimal
Date date

You can populate this after your ETL. You can make weekly, monthly, quarterly, yearly, etc. agg tables and schedule their population accordingly. This will improve reporting performance.
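
Populating the daily table after each load might look roughly like the following. This is a sketch under assumptions: the table and column names are hypothetical, the readings table is reduced to the columns the query needs, and the date literals stand in for "the day that was just loaded".

```sql
-- Minimal readings table so the sketch is self-contained (names are assumptions).
CREATE TABLE IF NOT EXISTS reading (
    id            BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    meter_id      INT UNSIGNED    NOT NULL,
    reading_value DECIMAL(12,3)   NOT NULL,
    value_time    DATETIME        NOT NULL,
    PRIMARY KEY (id),
    KEY ix_time (value_time)
) ENGINE=InnoDB;

CREATE TABLE IF NOT EXISTS daily_agg (
    id           BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    meter_id     INT UNSIGNED    NOT NULL,
    sum_of_value DECIMAL(14,3)   NOT NULL,
    reading_date DATE            NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uq_meter_day (meter_id, reading_date)
) ENGINE=InnoDB;

-- Summarise one day of hourly values into one row per meter.
-- The half-open range on value_time lets MySQL use the ix_time index.
INSERT INTO daily_agg (meter_id, sum_of_value, reading_date)
SELECT meter_id, SUM(reading_value), DATE(value_time)
FROM reading
WHERE value_time >= '2016-02-11 00:00:00'
  AND value_time <  '2016-02-12 00:00:00'
GROUP BY meter_id, DATE(value_time);
```

The unique key on (meter_id, reading_date) makes an accidental second run for the same day fail instead of silently duplicating rows.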

Answer 1 (score: 0)

Building on Stan Shaw's answer...

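If the data arrives as a CSV file, use LOAD DATA nightly. You should load into a temporary table, massage the data, and then copy it into the real table. Possibly no C# code is needed at all.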
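
A nightly load of this kind could be sketched as below. The file path, the CSV layout (header line, comma-separated, three columns) and all table and column names are assumptions, and LOCAL only works if local_infile is enabled on both client and server.

```sql
-- Target table (assumed names); one row per meter per hour.
CREATE TABLE IF NOT EXISTS reading (
    meter_id      INT UNSIGNED  NOT NULL,
    value_time    DATETIME      NOT NULL,
    reading_value DECIMAL(12,3) NOT NULL,
    PRIMARY KEY (meter_id, value_time)
) ENGINE=InnoDB;

-- Staging table: no keys, nullable value, so a bad line does not abort the load.
CREATE TABLE IF NOT EXISTS staging_reading (
    meter_id      INT UNSIGNED  NOT NULL,
    value_time    DATETIME      NOT NULL,
    reading_value DECIMAL(12,3) NULL
) ENGINE=InnoDB;

TRUNCATE TABLE staging_reading;

-- Hypothetical path and file format.
LOAD DATA LOCAL INFILE '/path/to/readings.csv'
INTO TABLE staging_reading
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(meter_id, value_time, reading_value);

-- "Massage" step reduced to its simplest form: skip rows without a value.
INSERT INTO reading (meter_id, value_time, reading_value)
SELECT meter_id, value_time, reading_value
FROM staging_reading
WHERE reading_value IS NOT NULL;
```

Loading a whole file in one statement and copying it with a single INSERT ... SELECT avoids the per-row round trips that tend to make row-by-row inserts from an application slow.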

DateAdded seems somewhat useless, and it clutters the table. Remove it completely, or build another table to log the uploads.
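
One possible shape for such an upload log is sketched here. Every name, and the choice of which facts to record per upload, is an assumption; the file name and row count in the INSERT are made-up illustration values.

```sql
-- One row per nightly file instead of one date_added per reading.
CREATE TABLE upload_log (
    upload_id  INT UNSIGNED NOT NULL AUTO_INCREMENT,
    file_name  VARCHAR(255) NOT NULL,
    data_date  DATE         NOT NULL,  -- the day the readings belong to
    loaded_at  DATETIME     NOT NULL,  -- when the file was loaded
    row_count  INT UNSIGNED NOT NULL,
    PRIMARY KEY (upload_id)
) ENGINE=InnoDB;

-- Hypothetical entry written at the end of a nightly load.
INSERT INTO upload_log (file_name, data_date, loaded_at, row_count)
VALUES ('readings_2016-02-11.csv', '2016-02-11', NOW(), 600000);
```

Comparing data_date with loaded_at gives the collection delay per file, which may be enough for the performance index mentioned in the question if all values in a file arrive together.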

Don't bother with an ID on the main table; (MeterID, Timestamp) is the 'natural' PRIMARY KEY. Again, this saves space.
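
In MySQL DDL, a main table keyed this way could look like the sketch below; the names and the DECIMAL precision are assumptions.

```sql
-- No surrogate id: the (meter, hour) pair identifies a row on its own.
CREATE TABLE reading (
    meter_id      INT UNSIGNED  NOT NULL,
    value_time    DATETIME      NOT NULL,
    reading_value DECIMAL(12,3) NOT NULL,
    PRIMARY KEY (meter_id, value_time)
) ENGINE=InnoDB;

-- Typical lookup served directly by the primary key (hypothetical meter id).
SELECT value_time, reading_value
FROM reading
WHERE meter_id = 12345
  AND value_time >= '2016-02-01 00:00:00'
  AND value_time <  '2016-03-01 00:00:00'
ORDER BY value_time;
```

With InnoDB the primary key is the clustered index, so the rows for one meter are stored together in time order, and there is no separate id column or extra unique index to maintain.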

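I would build only daily summary rows, in a single summary table. That table is likely to be fast enough to handle weekly and monthly queries. Only if it is not fast enough should you consider summaries of summaries.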
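
To illustrate, a monthly report can be computed from the daily rows at query time, as in this sketch. The table and column names are assumptions.

```sql
-- Daily summary table (assumed names): one row per meter per day.
CREATE TABLE IF NOT EXISTS daily_agg (
    meter_id     INT UNSIGNED  NOT NULL,
    reading_date DATE          NOT NULL,
    sum_of_value DECIMAL(14,3) NOT NULL,
    PRIMARY KEY (meter_id, reading_date)
) ENGINE=InnoDB;

-- Monthly totals per meter for one year, derived from the daily rows.
SELECT meter_id,
       DATE_FORMAT(reading_date, '%Y-%m') AS ym,
       SUM(sum_of_value)                  AS monthly_sum
FROM daily_agg
WHERE reading_date >= '2015-01-01'
  AND reading_date <  '2016-01-01'
GROUP BY meter_id, DATE_FORMAT(reading_date, '%Y-%m');
```

The daily table holds about 25 000 rows per day (one per meter) instead of 600 000, so a query like this scans roughly 1/24 of the rows the hourly table would require.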