I'm having trouble parsing an XML string stored in a column of type NVARCHAR(MAX) (I cannot change the type of this column).
This is my table (WorkingHours):
CREATE TABLE WorkingHours(
[ID] [int] NOT NULL PRIMARY KEY,
[CONTENT] [nvarchar](MAX) NOT NULL,
-- ...
);
Here is a sample value of the [CONTENT] column:
<?xml version="1.0" encoding="UTF-8"?>
<calendar>
<day number="1" worked_day="no">
<interval number="1" begin_hour="08:30" end_hour="12:00"/>
<interval number="2" begin_hour="13:30" end_hour="17:00"/>
<interval number="3" begin_hour="" end_hour=""/>
</day>
<day number="2" worked_day="no">
<interval number="1" begin_hour="08:30" end_hour="12:00"/>
<interval number="2" begin_hour="13:30" end_hour="17:00"/>
<interval number="3" begin_hour="" end_hour=""/>
</day>
<day number="3" worked_day="no">
<interval number="1" begin_hour="08:30" end_hour="12:00"/>
<interval number="2" begin_hour="13:30" end_hour="17:00"/>
<interval number="3" begin_hour="" end_hour=""/>
</day>
<day number="4" worked_day="no">
<interval number="1" begin_hour="08:30" end_hour="12:00"/>
<interval number="2" begin_hour="13:30" end_hour="17:00"/>
<interval number="3" begin_hour="" end_hour=""/>
</day>
<day number="5" worked_day="no">
<interval number="1" begin_hour="08:30" end_hour="12:00"/>
<interval number="2" begin_hour="13:30" end_hour="17:00"/>
<interval number="3" begin_hour="" end_hour=""/>
</day>
<day number="6" worked_day="no">
<interval number="1" begin_hour="" end_hour=""/>
<interval number="2" begin_hour="" end_hour=""/>
<interval number="3" begin_hour="" end_hour=""/>
</day>
<day number="7" worked_day="no">
<interval number="1" begin_hour="" end_hour=""/>
<interval number="2" begin_hour="" end_hour=""/>
<interval number="3" begin_hour="" end_hour=""/>
</day>
</calendar>
As you can see, the data is encoded as UTF-8.
Now I want to parse this data to perform some calculations:
DECLARE @RawContent [nvarchar](MAX) = (
SELECT wh.[CONTENT]
FROM [WorkingHours] wh
WHERE wh.[ID] = 100);
DECLARE @XMLContent [Xml] = @RawContent; -- KO
-- DECLARE @XMLContent [Xml] = CAST(@RawContent AS XML); -- KO
-- DECLARE @XMLContent [Xml] = CONVERT(XML, @RawContent); -- KO
-- Just a test to query XML data.
SELECT
C.WD.value('@number', 'int') AS DayId
FROM @XMLContent.nodes('/calendar/day') AS C(WD);
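I don't know how to convert the result (an nvarchar(max) column containing a UTF-8 XML string) into an XML value. SQL Server returns the following error:
"Unable to switch encoding"
It refers to the CAST line (where I define the @XMLContent variable).
Any ideas on how to solve this?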
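Answer 0 (score: 5)
Remove the XML declaration (the leading <?xml ... ?> line). It serves no purpose and is incorrect, because the data is already encoded as UTF-16 (it is stored as NVARCHAR). If you cannot change the data that already exists, you will have to rely on (slightly fragile) string replacement: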
CAST(REPLACE(wh.[CONTENT], '<?xml version="1.0" encoding="UTF-8"?>', '') AS XML)
Note that explicitly declaring the encoding as UTF-16 also works, although it adds nothing.
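As a rough sketch of both ideas, the script below first shows the failing conversion, then strips the declaration without depending on its exact text, and finally tries the UTF-16 rewrite. The inline sample document, the variable names @Stripped and @XMLContent16, and the assumption that the declaration (if present) sits at the very start of the string followed by a space are illustrative choices, not part of the original answer. TRY_CAST needs SQL Server 2012 or later.

```sql
-- Hypothetical sample standing in for WorkingHours.[CONTENT].
DECLARE @RawContent nvarchar(MAX) =
    N'<?xml version="1.0" encoding="UTF-8"?>'
  + N'<calendar><day number="1" worked_day="no"/><day number="2" worked_day="no"/></calendar>';

-- The direct conversion fails; TRY_CAST returns NULL instead of raising the error.
SELECT TRY_CAST(@RawContent AS xml) AS DirectCast;

-- Variation 1: if the string starts with an XML declaration, cut everything
-- up to and including the first "?>" (assumes no leading whitespace or BOM).
DECLARE @Stripped nvarchar(MAX) =
    CASE WHEN @RawContent LIKE N'<?xml %'
         THEN STUFF(@RawContent, 1, CHARINDEX(N'?>', @RawContent) + 1, N'')
         ELSE @RawContent
    END;
DECLARE @XMLContent xml = CAST(@Stripped AS xml);

-- Variation 2: keep the declaration, but make it match the NVARCHAR storage.
DECLARE @XMLContent16 xml =
    CAST(REPLACE(@RawContent, N'encoding="UTF-8"', N'encoding="UTF-16"') AS xml);

-- Same test query as in the question.
SELECT C.WD.value('@number', 'int') AS DayId
FROM @XMLContent.nodes('/calendar/day') AS C(WD);
```

Variation 1 still depends on how the stored strings begin, so it is worth checking a few real rows (for example for leading whitespace) before relying on it.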
Answer 1 (score: 1)
Another option is to convert to the VARCHAR data type (non-Unicode) first, and then convert to XML:
DECLARE @RawContent [nvarchar](MAX) = (
SELECT wh.[CONTENT]
FROM [WorkingHours] wh
WHERE wh.[ID] = 100);
DECLARE @XMLContent XML = CAST(CAST(@RawContent AS VARCHAR(MAX)) AS XML);
-- Just a test to query XML data.
SELECT
C.WD.value('@number', 'int') AS DayId
FROM @XMLContent.nodes('/calendar/day') AS C(WD);
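One thing to check before relying on the VARCHAR round trip: the cast from NVARCHAR to VARCHAR goes through the code page of the collation in effect, so characters that code page cannot represent may be lost or altered. The sample in the question is plain ASCII and should not be affected. The sketch below uses a made-up <note> document (not from the question) to compare the round trip with the string-replacement approach; what the first column shows depends on the database collation, so no output is claimed here.

```sql
-- Hypothetical document containing characters outside most single-byte code pages.
DECLARE @Raw nvarchar(MAX) =
    N'<?xml version="1.0" encoding="UTF-8"?><note text="日本語"/>';

-- Round trip through VARCHAR; TRY_CAST yields NULL if the result no longer parses.
DECLARE @ViaVarchar xml = TRY_CAST(CAST(@Raw AS varchar(MAX)) AS xml);

-- Strip the declaration and stay in NVARCHAR throughout.
DECLARE @ViaReplace xml =
    CAST(REPLACE(@Raw, N'<?xml version="1.0" encoding="UTF-8"?>', N'') AS xml);

-- Compare the attribute value recovered by each approach.
SELECT
    @ViaVarchar.value('(/note/@text)[1]', 'nvarchar(20)') AS ViaVarchar,
    @ViaReplace.value('(/note/@text)[1]', 'nvarchar(20)') AS ViaReplace;
```

If the two columns differ on your server, the stored content is not safe to pass through VARCHAR, and removing or rewriting the declaration is the safer route.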