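Splitting one column into multiple columns

Time: 2018-09-04 17:58:08

Tags: sql json sql-server tsql

I am working with a Power BI audit log report file. The file contains a column "AuditDate" that holds several fields packed into one value. I need to split this column into multiple columns using SQL.

The column has values like this: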

AuditDate
------------
"{""Id"":""44de2468"",""RecordType"":20,""CreationTime"":""2018-08-03T12:30:34"",""Operation"":""ViewReport"",""OrganizationId"":""779558"",""UserType"":0,""UserKey"":""FFFA3DA"",""Workload"":""PowerBI"",""UserId"":""john@abc.com"",""ClientIP"":""9.5.3.26"",""UserAgent"":""Mozilla\/5.0 (Windows NT 10.0;"",""Activity"":""ViewReport"",""ItemName"":""Sales"",""WorkSpaceName"":""TeamITO"",""DatasetName"":""Sales1"",""ReportName"":""Sales1"",""WorkspaceId"":""e8eaa0ca"",""ObjectId"":""Sales1"",""DatasetId"":""4c5d-ad45-eb6546"",""ReportId"":""4cb0-99ad-de41b5160c47"",""IsSuccess"":true,""DatapoolRefreshScheduleType"":""None"",""DatapoolType"":""Undefined""}"

Basically, I need to split this column into:

id         RecordType  CreationTime          Operation   OrganizationID  UserType
----------------------------------------------------------------------------------
44de2468   20          2018-08-03T12:30:34   ViewReport  779558          0

Can anyone help with the SQL query?

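3 answers:

Answer 0: (score: 2)

This is quite simple; all you need is a string "splitter" (AKA tokenizer). If you are on SQL Server 2016+, you can use STRING_SPLIT; on earlier versions you can use DelimitedSplit8K on 2005+ or DelimitedSplit8K_LEAD on 2012+. The solution looks like this: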

DECLARE @AuditDate VARCHAR(8000) = 
'"{""Id"":""44de2468"",""RecordType"":20,""CreationTime"":""2018-08-03T12:30:34"",""Operation"":""ViewReport"",""OrganizationId"":""779558"",""UserType"":0,""UserKey"":""FFFA3DA"",""Workload"":""PowerBI"",""UserId"":""john@abc.com"",""ClientIP"":""9.5.3.26"",""UserAgent"":""Mozilla\/5.0 (Windows NT 10.0;"",""Activity"":""ViewReport"",""ItemName"":""Sales"",""WorkSpaceName"":""TeamITO"",""DatasetName"":""Sales1"",""ReportName"":""Sales1"",""WorkspaceId"":""e8eaa0ca"",""ObjectId"":""Sales1"",""DatasetId"":""4c5d-ad45-eb6546"",""ReportId"":""4cb0-99ad-de41b5160c47"",""IsSuccess"":true,""DatapoolRefreshScheduleType"":""None"",""DatapoolType"":""Undefined""}"'

SELECT 
  -- Key names use the same casing as in the JSON, so the match also works under a case-sensitive collation.
  Id             = MAX(CASE split.attrib WHEN 'Id'             THEN split.val END),
  RecordType     = MAX(CASE split.attrib WHEN 'RecordType'     THEN split.val END),
  CreationTime   = MAX(CASE split.attrib WHEN 'CreationTime'   THEN split.val END),
  Operation      = MAX(CASE split.attrib WHEN 'Operation'      THEN split.val END),
  OrganizationId = MAX(CASE split.attrib WHEN 'OrganizationId' THEN split.val END),
  UserType       = MAX(CASE split.attrib WHEN 'UserType'       THEN split.val END)
FROM
(
  -- Split on commas, then cut each piece at its first colon into attribute and value,
  -- stripping the braces and quotes that surround them.
  SELECT      attrib = REPLACE(REPLACE(SUBSTRING(split.value, 1, mid.point-1),'{',''),'"',''),
              val    = REPLACE(REPLACE(SUBSTRING(split.value, mid.point+1, 8000),'{',''),'"','')
  FROM        STRING_SPLIT(@AuditDate,',') AS split
  CROSS APPLY (VALUES(CHARINDEX(':', split.value))) AS mid(point)
  WHERE       REPLACE(REPLACE(SUBSTRING(split.value, 1, mid.point-1),'{',''),'"','') IN
                ('Id','RecordType','CreationTime','Operation','OrganizationId','UserType')
) AS split;

Result:

Id         RecordType  CreationTime          Operation   OrganizationId  UserType
---------- ----------- --------------------- ----------- --------------- ---------
44de2468   20          2018-08-03T12:30:34   ViewReport  779558          0
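
The query above works on a single variable, while the question is about a column. A minimal sketch of the same splitter idea applied row by row follows. The table variable @AuditLog, its RowId column, and the second sample row are assumptions made up for illustration. Two caveats apply: STRING_SPLIT needs database compatibility level 130 or higher, and any value that itself contains a comma will be cut in the wrong place.

```sql
-- Hypothetical table standing in for the imported audit log.
DECLARE @AuditLog TABLE
(
  RowId     INT IDENTITY(1,1) PRIMARY KEY,
  AuditDate VARCHAR(8000)
);

INSERT @AuditLog (AuditDate)
VALUES
  ('"{""Id"":""44de2468"",""RecordType"":20,""Operation"":""ViewReport""}"')
 ,('"{""Id"":""44de2469"",""RecordType"":20,""Operation"":""ExportReport""}"');

SELECT
  a.RowId,
  Id         = MAX(CASE kv.attrib WHEN 'Id'         THEN kv.val END),
  RecordType = MAX(CASE kv.attrib WHEN 'RecordType' THEN kv.val END),
  Operation  = MAX(CASE kv.attrib WHEN 'Operation'  THEN kv.val END)
FROM @AuditLog AS a
CROSS APPLY STRING_SPLIT(a.AuditDate, ',') AS s
-- NULLIF turns "no colon found" (0) into NULL, so LEFT/SUBSTRING return NULL instead of raising an error.
CROSS APPLY (VALUES (NULLIF(CHARINDEX(':', s.value), 0))) AS mid(point)
CROSS APPLY (VALUES
(
  REPLACE(REPLACE(REPLACE(LEFT(s.value, mid.point - 1), '{', ''), '}', ''), '"', ''),
  REPLACE(REPLACE(REPLACE(SUBSTRING(s.value, mid.point + 1, 8000), '{', ''), '}', ''), '"', '')
)) AS kv(attrib, val)
WHERE mid.point IS NOT NULL
GROUP BY a.RowId;   -- one output row per source row
```

Grouping by a row key is what keeps the pivoted values of different audit records apart; without such a key the MAX() aggregates would mix all rows together.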

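Answer 1: (score: 1)

You appear to be dealing with a malformed JSON column here. Those doubled double quotes are a nuisance.

However, if you can clean up the format, you can simply use the JSON functions in your query.

First, set up the data (using the data you provided in the other copy of this question, "Split column values into multiple columns"):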

DECLARE @t TABLE 
(
  RecordType NVARCHAR(20)
  ,AuditDate NVARCHAR(MAX)
);
INSERT @t
  (
    RecordType
    ,AuditDate
  )
VALUES
  ('View', '{""Id"":""44de2468"",""Type"":20,""CreationDate"":""2018-08-23""}')
 ,('Edit', '{""Id"":""44de2467"",""Type"":40,""CreationDate"":""2018-08-24""}')
 ,('Print', '{""Id"":""44de2768"",""Type"":60,""CreationDate"":""2018-05-06""}')
 ,('Delete', '{""Id"":""44de2488"",""Type"":30,""CreationDate"":""2018-07-20""}');

Now clean up the malformed JSON by replacing each doubled double quote with a single double quote.

UPDATE @t
SET AuditDate = REPLACE(AuditDate,'""','"');

Check what the JSON looks like.

SELECT * FROM @t

--Results:
+------------+---------------------------------------------------------+
| RecordType |                        AuditDate                        |
+------------+---------------------------------------------------------+
| View       | {"Id":"44de2468","Type":20,"CreationDate":"2018-08-23"} |
| Edit       | {"Id":"44de2467","Type":40,"CreationDate":"2018-08-24"} |
| Print      | {"Id":"44de2768","Type":60,"CreationDate":"2018-05-06"} |
| Delete     | {"Id":"44de2488","Type":30,"CreationDate":"2018-07-20"} |
+------------+---------------------------------------------------------+
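
Looking at the strings is one check; asking the engine whether they parse is another. A minimal sketch, assuming SQL Server 2016+ where ISJSON is available. The table variable @check and its two rows are made up for illustration.

```sql
-- Hypothetical sample: one row still escaped, one row already clean.
DECLARE @check TABLE (AuditDate NVARCHAR(MAX));

INSERT @check (AuditDate)
VALUES
  ('{""Id"":""44de2468"",""Type"":20}')   -- doubled quotes, not valid JSON
 ,('{"Id":"44de2467","Type":40}');        -- valid JSON

SELECT
    AuditDate
  , ISJSON(AuditDate)                     AS IsValidBefore  -- 1 = parses as JSON, 0 = does not
  , ISJSON(REPLACE(AuditDate, '""', '"')) AS IsValidAfter
FROM @check;
```

One thing to watch: running the REPLACE on a string that is already clean would damage legitimate empty string values (`""`), so it is worth applying the cleanup only to rows where ISJSON reports 0.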

Then use JSON_VALUE() to extract the parts you are interested in.

SELECT 
    RecordType
  , JSON_VALUE(AuditDate, '$.Id') AS [Id]
  , JSON_VALUE(AuditDate, '$.Type') AS [Type]
  , JSON_VALUE(AuditDate, '$.CreationDate') AS CreationDate
FROM @t

--Results
+------------+----------+------+--------------+
| RecordType |    Id    | Type | CreationDate |
+------------+----------+------+--------------+
| View       | 44de2468 |   20 | 2018-08-23   |
| Edit       | 44de2467 |   40 | 2018-08-24   |
| Print      | 44de2768 |   60 | 2018-05-06   |
| Delete     | 44de2488 |   30 | 2018-07-20   |
+------------+----------+------+--------------+
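
JSON_VALUE always returns text (nvarchar(4000)), so the columns above are strings even where they look like numbers or dates. A minimal sketch of a typed variant that also leaves the stored data untouched by cleaning the string inline. The table variable @t2 and the alias CleanJson are assumptions for illustration.

```sql
DECLARE @t2 TABLE (AuditDate NVARCHAR(MAX));

INSERT @t2 (AuditDate)
VALUES ('{""Id"":""44de2468"",""Type"":20,""CreationDate"":""2018-08-23""}');

SELECT
    JSON_VALUE(c.CleanJson, '$.Id')                              AS [Id]
    -- TRY_CONVERT yields NULL instead of an error when a value does not fit the target type.
  , TRY_CONVERT(INT,  JSON_VALUE(c.CleanJson, '$.Type'))         AS [Type]
  , TRY_CONVERT(DATE, JSON_VALUE(c.CleanJson, '$.CreationDate')) AS CreationDate
FROM @t2 AS t
-- Clean the doubled quotes per row, without an UPDATE.
CROSS APPLY (VALUES (REPLACE(t.AuditDate, '""', '"'))) AS c(CleanJson);
```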

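Answer 2: (score: 1)

With SQL Server 2016 this is quite easy, since it comes with fairly good JSON support. Your only problem is that your string is not valid JSON. Apparently some engine doubled all the inner quotes (an escaping technique).

If this is under your control, you should try to change the column's format to proper JSON. Ideally, the writing application should deliver these audits as proper JSON. At the very least, you could add a second column and keep it in sync with a trigger. As a last resort, you can use REPLACE to repair your string: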

REPLACE(REPLACE(REPLACE(@YourString,'"{','{'),'}"','}'),'""','"');

With many rows this may take some time... which is why it is better to store the data as proper JSON in the first place.
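
As a variation on the "second column" idea, a persisted computed column can hold the repaired string so that the REPLACE work is done once at write time. This swaps a computed column in for the trigger mentioned above; it is a sketch under that assumption, and the temp table #AuditLog and the column name AuditJson are made up for illustration.

```sql
CREATE TABLE #AuditLog
(
    AuditDate NVARCHAR(MAX) NOT NULL,
    -- Repaired JSON, computed from AuditDate and stored with the row.
    AuditJson AS REPLACE(REPLACE(REPLACE(AuditDate, '"{', '{'), '}"', '}'), '""', '"') PERSISTED
);

INSERT #AuditLog (AuditDate)
VALUES (N'"{""Id"":""44de2468"",""RecordType"":20}"');

-- Queries can now use the JSON functions on the clean column directly.
SELECT AuditJson
     , JSON_VALUE(AuditJson, '$.Id') AS Id
FROM #AuditLog;

DROP TABLE #AuditLog;
```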

Just to show the principle:

DECLARE @YourString NVARCHAR(MAX)=N'"{""Id"":""44de2468"",""RecordType"":20,""CreationTime"":""2018-08-03T12:30:34"",""Operation"":""ViewReport"",""OrganizationId"":""779558"",""UserType"":0,""UserKey"":""FFFA3DA"",""Workload"":""PowerBI"",""UserId"":""john@abc.com"",""ClientIP"":""9.5.3.26"",""UserAgent"":""Mozilla\/5.0 (Windows NT 10.0;"",""Activity"":""ViewReport"",""ItemName"":""Sales"",""WorkSpaceName"":""TeamITO"",""DatasetName"":""Sales1"",""ReportName"":""Sales1"",""WorkspaceId"":""e8eaa0ca"",""ObjectId"":""Sales1"",""DatasetId"":""4c5d-ad45-eb6546"",""ReportId"":""4cb0-99ad-de41b5160c47"",""IsSuccess"":true,""DatapoolRefreshScheduleType"":""None"",""DatapoolType"":""Undefined""}"';

SET @YourString = REPLACE(REPLACE(REPLACE(@YourString,'"{','{'),'}"','}'),'""','"');

Your string will now look like this:

{"Id":"44de2468","RecordType":20,"CreationTime":"2018-08-03T12:30:34","Operation":"ViewReport","OrganizationId":"779558","UserType":0,"UserKey":"FFFA3DA","Workload":"PowerBI","UserId":"john@abc.com","ClientIP":"9.5.3.26","UserAgent":"Mozilla\/5.0 (Windows NT 10.0;","Activity":"ViewReport","ItemName":"Sales","WorkSpaceName":"TeamITO","DatasetName":"Sales1","ReportName":"Sales1","WorkspaceId":"e8eaa0ca","ObjectId":"Sales1","DatasetId":"4c5d-ad45-eb6546","ReportId":"4cb0-99ad-de41b5160c47","IsSuccess":true,"DatapoolRefreshScheduleType":"None","DatapoolType":"Undefined"}

This query returns all columns as a derived list, one row per key:

SELECT * 
FROM OPENJSON(@YourString);

The result is a list with a type hint (the actual type of "value" is nvarchar in every row):

+-----------------------------+-------------------------------+------+
| key                         | value                         | type |
+-----------------------------+-------------------------------+------+
| Id                          | 44de2468                      | 1    |
+-----------------------------+-------------------------------+------+
| RecordType                  | 20                            | 2    |
+-----------------------------+-------------------------------+------+
| CreationTime                | 2018-08-03T12:30:34           | 1    |
+-----------------------------+-------------------------------+------+
| Operation                   | ViewReport                    | 1    |
+-----------------------------+-------------------------------+------+
| OrganizationId              | 779558                        | 1    |
+-----------------------------+-------------------------------+------+
| UserType                    | 0                             | 2    |
+-----------------------------+-------------------------------+------+
| UserKey                     | FFFA3DA                       | 1    |
+-----------------------------+-------------------------------+------+
| Workload                    | PowerBI                       | 1    |
+-----------------------------+-------------------------------+------+
| UserId                      | john@abc.com                  | 1    |
+-----------------------------+-------------------------------+------+
| ClientIP                    | 9.5.3.26                      | 1    |
+-----------------------------+-------------------------------+------+
| UserAgent                   | Mozilla/5.0 (Windows NT 10.0; | 1    |
+-----------------------------+-------------------------------+------+
| Activity                    | ViewReport                    | 1    |
+-----------------------------+-------------------------------+------+
| ItemName                    | Sales                         | 1    |
+-----------------------------+-------------------------------+------+
| WorkSpaceName               | TeamITO                       | 1    |
+-----------------------------+-------------------------------+------+
| DatasetName                 | Sales1                        | 1    |
+-----------------------------+-------------------------------+------+
| ReportName                  | Sales1                        | 1    |
+-----------------------------+-------------------------------+------+
| WorkspaceId                 | e8eaa0ca                      | 1    |
+-----------------------------+-------------------------------+------+
| ObjectId                    | Sales1                        | 1    |
+-----------------------------+-------------------------------+------+
| DatasetId                   | 4c5d-ad45-eb6546              | 1    |
+-----------------------------+-------------------------------+------+
| ReportId                    | 4cb0-99ad-de41b5160c47        | 1    |
+-----------------------------+-------------------------------+------+
| IsSuccess                   | true                          | 3    |
+-----------------------------+-------------------------------+------+
| DatapoolRefreshScheduleType | None                          | 1    |
+-----------------------------+-------------------------------+------+
| DatapoolType                | Undefined                     | 1    |
+-----------------------------+-------------------------------+------+
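
The numbers in the type column are JSON type codes. A small sketch that decodes them next to each key; the variable @json and the column alias TypeName are made up for illustration, and the string is a shortened version of the sample.

```sql
DECLARE @json NVARCHAR(MAX) = N'{"Id":"44de2468","RecordType":20,"IsSuccess":true}';

SELECT
    [key]
  , [value]
  , [type]
  , CASE [type]               -- type codes returned by OPENJSON without a WITH clause
        WHEN 0 THEN 'null'
        WHEN 1 THEN 'string'
        WHEN 2 THEN 'number'
        WHEN 3 THEN 'true/false'
        WHEN 4 THEN 'array'
        WHEN 5 THEN 'object'
    END AS TypeName
FROM OPENJSON(@json);
```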

Even better, you can add a WITH clause like this:

SELECT * 
FROM OPENJSON(@YourString)
WITH 
(   
    Id             varchar(200)  '$.Id',  
    RecordType     int           '$.RecordType',  
    CreationTime   datetime      '$.CreationTime'
    --Add all your known columns here...
)

This gives you typed values, side by side in one row:

+----------+------------+-------------------------+
| Id       | RecordType | CreationTime            |
+----------+------------+-------------------------+
| 44de2468 | 20         | 2018-08-03 12:30:34.000 |
+----------+------------+-------------------------+
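
Since the question is about a column rather than a single variable, the same WITH clause can be applied to every row through CROSS APPLY. A minimal sketch, assuming SQL Server 2016+ with compatibility level 130 or higher; the table variable @AuditLog and the alias CleanJson are made up for illustration, and the sample row is a shortened version of the value from the question.

```sql
DECLARE @AuditLog TABLE (AuditDate NVARCHAR(MAX));

INSERT @AuditLog (AuditDate)
VALUES
  (N'"{""Id"":""44de2468"",""RecordType"":20,""CreationTime"":""2018-08-03T12:30:34"",""Operation"":""ViewReport"",""OrganizationId"":""779558"",""UserType"":0}"');

SELECT j.*
FROM @AuditLog AS a
-- Repair the escaped string per row: strip the outer quotes, then un-double the inner ones.
CROSS APPLY (VALUES (REPLACE(REPLACE(REPLACE(a.AuditDate, '"{', '{'), '}"', '}'), '""', '"'))) AS c(CleanJson)
-- Shred the repaired JSON into typed columns.
CROSS APPLY OPENJSON(c.CleanJson)
WITH
(
    Id             varchar(200) '$.Id',
    RecordType     int          '$.RecordType',
    CreationTime   datetime     '$.CreationTime',
    Operation      varchar(200) '$.Operation',
    OrganizationId varchar(200) '$.OrganizationId',
    UserType       int          '$.UserType'
) AS j;
```

This produces the six columns the question asks for, one output row per audit record.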