Question

I have data sets like this:

**SET 1:**
Time = 2017-11-01 13:18:10 
Param1 = 42.42
Param2 = 47.11
Param3 = 12.34
.... (up to 100 parameters)

**SET 2:**
Time = 2017-11-01 13:18:20
Param1 = 45.17
Param2 = 46.11
Param3 = 12.35
.... (up to 100 parameters)

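I get a new set of data every 10 seconds. I need to save the data in SQL Server (I am free to define the tables).

Later I need to read the data back from the database so I can plot an XY graph with time on the X axis and a parameter on the Y axis.

I am considering saving the data as JSON: either as a plain string in a table (where the string is JSON) or using the JSON support in SQL Server 2016.

What is the recommended approach? Performance is a major concern for me.

I did some tests:

Simple string: for reference only. My data column contains just a 5-digit string.

XML string with attributes: the data column is a string (nvarchar(MAX)) containing XML with 35 nodes like this: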

<data>
    <Regtime value='2017-08-21 13:56:05'/>
    <MachineId value = 'Somefactory.SomeSite.DeviceId' />    
    <Values>
        <B_T_SP value = '181.23' unit = '1234' />
        <B_H_SP_Tdp value = '87.34' unit = '801' />
        <B_A_SP_v_air value = '42.42' unit = '500' />
        <S_T_SP value = '175' unit = '801' />
        <S_A_SP_v_air value = '57.23' unit = '500' />
        ...

XML string with nodes: same as above, but without attributes:

<data>
    <Regtime>'2017-11-01T12:59:02.2792518+01:00'</Regtime>
    <MachineId>'Somefactory.SomeSite.DeviceId'</MachineId>   
    <Values>
        <B_T_SP> 
            <value>666,50</value>
            <unit>801</unit>
        </B_T_SP>
        <B_H_SP_Tdp> 
            <value>414,21</value>
            <unit>801</unit>
        </B_H_SP_Tdp>
        <B_A_SP_v_air> 
            <value>41,83</value>
            <unit>801</unit>
        </B_A_SP_v_air>
        <S_T_SP> 
            <value>20,70</value>
            <unit>801</unit>
        </S_T_SP>                       
        ...

JSON string: the data column is a string (nvarchar(MAX)) containing JSON with 35 nodes like this:

{
    "data": {
        "Regtime": "2017-11-02T12:57:00.3745960+01:00",
        "MachineId": "Somefactory.SomeSite.DeviceId",
        "Values": {
            "B_T_SP": {
                "value": "703,81",
                "unit": "801"
            },
            "B_H_SP_Tdp": {
                "value": "485,90",
                "unit": "801"
            },
            "B_A_SP_v_air": {
                "value": "3,65",
                "unit": "801"
            },
            "S_T_SP": {
                "value": "130,44",
                "unit": "801"
            },
            ...

Distributed: as CodeCaster suggested, using two tables, with 35 SetParameters per Set.

When inserting data, I do it like this:

startTime = DateTime.Now;
using (ConnectionScope cs = new ConnectionScope())
{
    for (int i = 0; i < counts; i++)
    {
        sql = GetSqlAddData(DataType.XmlAttributeString);
        using (IDbCommand c = cs.CreateCommand(sql))
        {
            c.ExecuteScalar();
        }
    }
}
logText.AppendText(string.Format("{0}x XML attribute string  Insert took \t{1}\r\n", counts, DateTime.Now.Subtract(startTime)));

Except for Distributed, where I do it like this:

using (ConnectionScope cs = new ConnectionScope())
{
    for (int i = 0; i < counts; i++)
    {
        // Insert the set header and read back its generated id.
        sql = GetSqlAddData(DataType.Distributed);
        using (IDbCommand c = cs.CreateCommand(sql))
        {
            id = (int)c.ExecuteScalar();
        }

        // Insert the 35 parameter rows for this set, one statement each.
        for (int j = 0; j < 35; j++)
        {
            using (IDbCommand c = cs.CreateCommand($"INSERT into test_datalog_distr_detail (setid, name, value) VALUES ({id}, '{"param"+j}', '{j*100 + i}')"))
            {
                c.ExecuteScalar();
            }
        }
    }
}
logText.AppendText(string.Format("{0}x Distributed Insert \t{1}\r\n", counts, DateTime.Now.Subtract(startTime)));

When I read the data, I do it like this. Simple string:

var data = new List<Tuple<DateTime, string>>();
DateTime time;
string point;
string sql = GetSqlGetData(DataType.SimpleString);
var startTime = DateTime.Now;
using (ConnectionScope cs = new ConnectionScope())
{
    using (IDbCommand cmd = cs.CreateCommand(sql))
    {
        using (IDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                time = DateTime.Parse(reader[5].ToString());
                point = reader[12].ToString();
                data.Add(new Tuple<DateTime, string>(time, point));
            }
        }
    }
}
logText.AppendText(string.Format("{0}x Simple Select {1}\r\n", counts, DateTime.Now.Subtract(startTime)));

XML (attribute and node variants):

sql = GetSqlGetData(DataType.XmlAttributeString);
startTime = DateTime.Now;
var doc = new XmlDocument();
using (ConnectionScope cs = new ConnectionScope())
{
    using (IDbCommand cmd = cs.CreateCommand(sql))
    {
        using (IDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                time = DateTime.Parse(reader[5].ToString());
                // Parse the XML string client-side, then pick one value.
                doc = new XmlDocument();
                doc.LoadXml(reader[12].ToString());
                point = doc.SelectSingleNode("/data/Values/B_T_SP").Attributes["value"].Value;
                data.Add(new Tuple<DateTime, string>(time, point));
            }
        }
    }
}
logText.AppendText(string.Format("{0}x Select using XmlDoc and Attribute String {1}\r\n", counts, DateTime.Now.Subtract(startTime)));

JSON:

JObject jobj;
using (ConnectionScope cs = new ConnectionScope())
{
    using (IDbCommand cmd = cs.CreateCommand(sql))
    {
        using (IDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                time = DateTime.Parse(reader[5].ToString());
                // Parse the JSON string client-side, then pick one value.
                jobj = JObject.Parse(reader[12].ToString());
                point = jobj["data"]["Values"]["B_T_SP"]["value"].ToString();
                data.Add(new Tuple<DateTime, string>(time, point));
            }
        }
    }
}
logText.AppendText(string.Format("{0}x Select using JSON String {1}\r\n", counts, DateTime.Now.Subtract(startTime)));

Distributed tables (suggested by CodeCaster):

sql = GetSqlGetData(DataType.Distributed);
startTime = DateTime.Now;
using (ConnectionScope cs = new ConnectionScope())
{
    using (IDbCommand cmd = cs.CreateCommand(sql))
    {
        using (IDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                time = DateTime.Parse(reader[5].ToString());
                point = reader[15].ToString();
                data.Add(new Tuple<DateTime, string>(time, point));
            }
        }
    }
}
logText.AppendText(string.Format("{0}x Select on distributed tables {1}\r\n", counts, DateTime.Now.Subtract(startTime)));

I did a test run in which I inserted 100000 rows into the database and measured the time taken:

Simple string value:      18 seconds
XML string (attributes):  36 seconds
XML string (nodes only):  38 seconds
JSON string:              37 seconds
Distributed (CodeCaster): 8 MINUTES!
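
Part of the Distributed timing probably comes from the insert loop itself: it sends 36 separate statements per set, each as its own round trip and, unless the connection already wraps them in a transaction, its own commit. A rough sketch of sending one set as a single batch follows. The header table name `test_datalog_distr` and its `regtime` column are assumptions (only `test_datalog_distr_detail` appears in the code above), and the row list is shortened to three parameters.

```sql
-- Hypothetical header table: test_datalog_distr (id IDENTITY, regtime).
-- Only test_datalog_distr_detail (setid, name, value) comes from the code above.
BEGIN TRANSACTION;

INSERT INTO test_datalog_distr (regtime)
VALUES (SYSDATETIME());

-- Capture the identity value generated by the header insert.
DECLARE @id int = SCOPE_IDENTITY();

-- One multi-row INSERT instead of 35 single-row statements.
INSERT INTO test_datalog_distr_detail (setid, name, value)
VALUES (@id, 'param0', '0'),
       (@id, 'param1', '100'),
       (@id, 'param2', '200');  -- continue up to param34

COMMIT TRANSACTION;
```

Whether this closes most of the gap would have to be measured; it only removes the per-statement overhead, not the cost of writing 35 rows per set.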

Reading 100000 rows and getting one value from each row:

Simple string value:      0.4 seconds
XML string (attributes):  5.8 seconds
XML string (nodes only):  7.4 seconds
JSON string:              9.4 seconds
Distributed (CodeCaster): 0.5 seconds

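My conclusion so far: I am surprised that XML appears to be faster than JSON. I expected the distributed select to be faster than XML, especially since selecting a single parameter is done in SQL rather than afterwards in code, as with JSON and XML. But the insert into the distributed tables worries me. What I still need to test is using the XML and JSON support in the database instead of plain strings, so that I can select the parameter in SQL rather than afterwards in an XmlDocument.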
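
As a minimal sketch of what "selecting the parameter in SQL" could look like for the string columns described above: the table names `test_datalog_json` and `test_datalog_xml` and the column names `regtime` and `data` are assumptions, since the actual schema is not shown. `JSON_VALUE` requires SQL Server 2016 or later.

```sql
-- Hypothetical tables, each with regtime (datetime2) and data (nvarchar(max)).

-- JSON: JSON_VALUE extracts a scalar by path from a JSON string.
SELECT regtime,
       JSON_VALUE(data, '$.data.Values.B_T_SP.value') AS B_T_SP
FROM   test_datalog_json
ORDER BY regtime;

-- XML stored as a string: cast to xml, then read the attribute with value().
SELECT regtime,
       CAST(data AS xml).value('(/data/Values/B_T_SP/@value)[1]', 'nvarchar(20)') AS B_T_SP
FROM   test_datalog_xml
ORDER BY regtime;
```

Both queries return the value as text. The JSON sample stores numbers with a decimal comma ("703,81"), so it would need something like `REPLACE(..., ',', '.')` before converting to a numeric type.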

Answer 1

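Don't store JSON in a database column unless you want to create an inner-platform effect, that is, a database inside a database.

That applies especially if you want to query the stored data in a meaningful way. Sure, there are database systems that support this, but if you are able to inspect and transform the JSON beforehand, you should definitely choose that option.

Simply normalize the params into a junction table named SetParameters, with a foreign key to the Sets table.

So you end up with two tables: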

Sets

Id
Time

SetParameters

Id
SetId
Name
Value
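
A possible T-SQL rendering of these two tables, as a sketch only: the column types, the identity keys and the sample parameter name `Param1` in the query are assumptions, not something the layout above specifies.

```sql
-- Column types and key choices are assumptions.
CREATE TABLE Sets (
    Id     int IDENTITY(1,1) PRIMARY KEY,
    [Time] datetime2(0) NOT NULL
);

CREATE TABLE SetParameters (
    Id      bigint IDENTITY(1,1) PRIMARY KEY,
    SetId   int          NOT NULL REFERENCES Sets (Id),
    [Name]  nvarchar(50) NOT NULL,
    [Value] float        NOT NULL
);

-- One parameter over time, ready for an XY plot.
SELECT s.[Time], p.[Value]
FROM   Sets AS s
JOIN   SetParameters AS p ON p.SetId = s.Id
WHERE  p.[Name] = N'Param1'
ORDER BY s.[Time];
```

With this layout the parameter is selected by the database, so nothing has to be parsed on the client after the query returns.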

Answer 2

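Just to let you know what I ended up doing:

JSON simply took up too much database space, so I went with a three-level database structure in which parameter names are stored in one table and parameter values in another, to reduce the space used.

I then used table-valued parameters (TVPs) to insert the data efficiently.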
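
The actual schema is not given here, so the following is only a guess at how such a structure and TVP insert might look. Every object name and column type (`dbo.Parameter`, `dbo.DataSet`, `dbo.DataValue`, `dbo.ParamValueList`, `dbo.AddDataSet`) is hypothetical, as is the assumption that the third level is a table of sets. `GO` is the batch separator used by SSMS and sqlcmd.

```sql
-- All names and types below are hypothetical.
CREATE TABLE dbo.Parameter (
    Id     int IDENTITY(1,1) PRIMARY KEY,
    [Name] nvarchar(50) NOT NULL UNIQUE
);

CREATE TABLE dbo.DataSet (
    Id     int IDENTITY(1,1) PRIMARY KEY,
    [Time] datetime2(0) NOT NULL
);

CREATE TABLE dbo.DataValue (
    SetId   int   NOT NULL REFERENCES dbo.DataSet (Id),
    ParamId int   NOT NULL REFERENCES dbo.Parameter (Id),
    [Value] float NOT NULL,
    PRIMARY KEY (SetId, ParamId)
);
GO

-- Table type that carries all values of one set in a single call.
CREATE TYPE dbo.ParamValueList AS TABLE (
    ParamId int   NOT NULL PRIMARY KEY,
    [Value] float NOT NULL
);
GO

CREATE PROCEDURE dbo.AddDataSet
    @Time   datetime2(0),
    @Values dbo.ParamValueList READONLY
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;  -- roll back the whole set if either insert fails

    BEGIN TRANSACTION;

    INSERT INTO dbo.DataSet ([Time]) VALUES (@Time);
    DECLARE @SetId int = SCOPE_IDENTITY();

    -- Insert every value of the set with one set-based statement.
    INSERT INTO dbo.DataValue (SetId, ParamId, [Value])
    SELECT @SetId, ParamId, [Value]
    FROM   @Values;

    COMMIT TRANSACTION;
END;
GO

-- Usage from T-SQL; assumes Parameter rows with Id 1 and 2 already exist.
DECLARE @v dbo.ParamValueList;
INSERT INTO @v (ParamId, [Value]) VALUES (1, 42.42), (2, 47.11);
EXEC dbo.AddDataSet @Time = '2017-11-01T13:18:10', @Values = @v;
```

From ADO.NET, a TVP like this is typically passed as a `SqlParameter` with `SqlDbType.Structured`, filled from a `DataTable`, so one call replaces the per-row inserts.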

Saving JSON data with many attributes in a database
