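Reading a value from a complex XML structure with SQL Server

Time: 2016-08-08 10:25:18

Tags: sql sql-server xml tsql xml-parsing

I am trying to read a value, in a SQL Server query, from an XML structure stored in a column of data type ntext.

This is the XML structure from which I want to extract !!!VALUE TO READ!!!: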

<PrinterProcessDef xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://dev.docuware.com/settings/workflow/processdef" Id="3e62848d-040e-4f4c-a893-ed85a7b2878a" Type="PrinterProcess" ConfigId="c43792ed-1934-454b-a40f-5f4dfec933b0" Enabled="true" PCId="2837f136-028d-47ed-abdc-4103bedce1d2" Timestamp="2016-08-08T09:44:38.532415">
  <Configs>
    <Config xmlns:q1="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q1:PrinterProcessConfig" Id="c43792ed-1934-454b-a40f-5f4dfec933b0" />
    <Config xmlns:q2="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q2:RecognizeActConfig" Id="b89a6fc2-5573-4034-978a-752c6c0de4cf">
      <q2:Header DefaultRecognitionTechnology="OCR" DefaultOCRSettingsGuid="00000000-0000-0000-0000-000000000000">
      </q2:Header>
      <q2:Body>
        <q2:AnchorDefs />
        <q2:ZoneDefs />
        <q2:TableDefs />
        <q2:FaceLayouts>
        </q2:FaceLayouts>
        <q2:FaceSamples>
        </q2:FaceSamples>
        <q2:SampleDocument>
          <MetaData xmlns="http://dev.docuware.com/settings/common" FileName="Test - Editor" MimeType="application/pdf" PageCount="1" SourceAppName="C:\Windows\system32\NOTEPAD.EXE" DocumentTitle="Test - Editor" PdfCreator="DocuWare Printer" />
          <Data xmlns="http://dev.docuware.com/settings/common">!!!VALUE TO READ!!!</Data>
        </q2:SampleDocument>
      </q2:Body>
      <q2:AllPagesRequired>false</q2:AllPagesRequired>
    </Config>
    <Config xmlns:q3="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q3:RecognizeActConfig" Id="db5b195d-79e4-4804-bd38-f4fc7e8d5a8d">
    </Config>
    <Config xmlns:q4="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q4:AddOverlayActConfig" Id="023aab08-c6e3-4f08-9d26-0175d1564ef2">
      <q4:Overlays />
    </Config>
    <Config xmlns:q5="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q5:PrintActConfig" Id="4a4ec06a-8652-4777-84d2-53cb862b3328">
    </Config>
    <Config xmlns:q6="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q6:SignActConfig" Id="8c030961-e68e-4c2f-83f1-cac20f51d4d6">
    </Config>
    <Config xmlns:q7="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q7:EmailActConfig" Id="5dbd144b-5c33-407a-b638-e062f9045fb4">
    </Config>
    <Config xmlns:q8="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q8:IndexActConfig" Id="f2a70e07-d76e-4e82-9313-7c665df4c311">
    </Config>
    <Config xmlns:q10="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q10:StoreActConfig" Id="ff8aec66-608e-4dde-a4b6-de65ada39bb0">
    </Config>
    <Config xmlns:q11="http://dev.docuware.com/settings/workflow/processconfig" xsi:type="q11:NotifyUserActConfig" Id="7ffb0437-6b8c-4f5f-8f40-434f4a6d609a" />
  </Configs>
  <Activities>
  </Activities>
</PrinterProcessDef>

This is the SQL query I am using:

SELECT 
    CAST([Table].[settings] as xml)
        .value('declare namespace q2="http://dev.docuware.com/settings/workflow/processconfig";
        (/PrinterProcessDef/Configs/Config[@xsi:type="q2:RecognizeActConfig"]/q2:Body/q2:SampleDocument/Data/text())[1]',
        'varchar(max)')
FROM 
    [DB].[dbo].[Table]

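All I get is NULL instead of the expected !!!VALUE TO READ!!!.

What do I have to do to make the query work?

I have also tried other versions without the namespace declaration, but I always get NULL.

3 Answers:

Answer 0: (score: 3)

Every element in this document belongs to a namespace. You have to declare those namespaces and qualify each step of the path with the namespace the element is defined in: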

SELECT CAST([Table].[settings] as xml).value(
   'declare namespace top="http://dev.docuware.com/settings/workflow/processdef";
    declare namespace q2="http://dev.docuware.com/settings/workflow/processconfig";
    declare namespace nd="http://dev.docuware.com/settings/common";
    (/top:PrinterProcessDef/top:Configs/top:Config[@xsi:type="q2:RecognizeActConfig"]/q2:Body/q2:SampleDocument/nd:Data)[1]',  
        'varchar(max)')
FROM [DB].[dbo].[Table]
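
To see why the path from the question finds nothing, here is a minimal sketch that runs without the real table. The @xml variable holds a trimmed-down, hypothetical copy of the document from the question (only the elements on the way to Data are kept), and the column aliases are made up for illustration.

```sql
-- Hypothetical, trimmed-down copy of the XML from the question
DECLARE @xml xml = N'
<PrinterProcessDef xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xmlns="http://dev.docuware.com/settings/workflow/processdef">
  <Configs>
    <Config xmlns:q2="http://dev.docuware.com/settings/workflow/processconfig"
            xsi:type="q2:RecognizeActConfig">
      <q2:Body>
        <q2:SampleDocument>
          <Data xmlns="http://dev.docuware.com/settings/common">!!!VALUE TO READ!!!</Data>
        </q2:SampleDocument>
      </q2:Body>
    </Config>
  </Configs>
</PrinterProcessDef>';

SELECT
    -- Path as in the question: PrinterProcessDef, Configs, Config and Data are
    -- looked up in no namespace, so no element matches and .value yields NULL
    @xml.value('declare namespace q2="http://dev.docuware.com/settings/workflow/processconfig";
        (/PrinterProcessDef/Configs/Config[@xsi:type="q2:RecognizeActConfig"]/q2:Body/q2:SampleDocument/Data/text())[1]',
        'varchar(max)') AS UnqualifiedPath,
    -- Same path with every step bound to the namespace it is declared in
    @xml.value('declare namespace top="http://dev.docuware.com/settings/workflow/processdef";
        declare namespace q2="http://dev.docuware.com/settings/workflow/processconfig";
        declare namespace nd="http://dev.docuware.com/settings/common";
        (/top:PrinterProcessDef/top:Configs/top:Config[@xsi:type="q2:RecognizeActConfig"]/q2:Body/q2:SampleDocument/nd:Data/text())[1]',
        'varchar(max)') AS QualifiedPath;
```

Note that the prefixes in the query (top, q2, nd) only have to map to the right namespace URIs; they do not need to match the prefixes used in the document. The predicate on @xsi:type is different: it compares the attribute value as a plain string, so it only matches while the document literally contains "q2:RecognizeActConfig".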

Answer 1: (score: 1)

You forgot the namespaces declared with the xmlns attributes. Have a look at the following example:

DECLARE @xml xml = 'yourXml'

SELECT @xml.value('
declare namespace q2="http://dev.docuware.com/settings/workflow/processconfig";
declare namespace g="http://dev.docuware.com/settings/workflow/processdef";
declare namespace qd="http://dev.docuware.com/settings/common";
(//g:PrinterProcessDef/g:Configs/g:Config[@xsi:type="q2:RecognizeActConfig"]/q2:Body/q2:SampleDocument/qd:Data/text())[1]',
    'varchar(max)')

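Answer 2: (score: 1)

However this XML was generated, its namespaces are odd: the same namespace is declared over and over again. If I am not mistaken, the namespaces are not really used the way they are meant to be, so I would simply ignore them with the *: wildcard: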

SELECT 
    CAST([Table].[settings] as xml)
        .value('(/*:PrinterProcessDef/*:Configs/*:Config[@*:type="q2:RecognizeActConfig"]/*:Body/*:SampleDocument/*:Data/text())[1]',
        'varchar(max)')
FROM 
    [DB].[dbo].[Table]

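In any case, I would suggest declaring the namespaces in WITH XMLNAMESPACES rather than inside the .value function. If you need more than one value, this gives you a much more readable query: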

WITH XMLNAMESPACES(DEFAULT 'http://dev.docuware.com/settings/workflow/processdef'
                  ,'http://dev.docuware.com/settings/workflow/processconfig' AS q2
                  ,'http://dev.docuware.com/settings/common' AS nd)
SELECT 
    CAST([Table].[settings] as xml)
        .value('(/PrinterProcessDef/Configs/Config[@xsi:type="q2:RecognizeActConfig"]/q2:Body/q2:SampleDocument/nd:Data)[1]',
        'varchar(max)')
FROM 
    [DB].[dbo].[Table]
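
As a rough illustration of the "more than one value" case, the sketch below declares the namespaces once and reads several values from the same Config node via .nodes(). The @xml variable is again a hypothetical, trimmed-down copy of the question's document, and the aliases A, cfg, ConfigId, FileName and SampleData are made up for this example.

```sql
-- Hypothetical, trimmed-down copy of the XML from the question
DECLARE @xml xml = N'
<PrinterProcessDef xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xmlns="http://dev.docuware.com/settings/workflow/processdef">
  <Configs>
    <Config xmlns:q2="http://dev.docuware.com/settings/workflow/processconfig"
            xsi:type="q2:RecognizeActConfig" Id="b89a6fc2-5573-4034-978a-752c6c0de4cf">
      <q2:Body>
        <q2:SampleDocument>
          <MetaData xmlns="http://dev.docuware.com/settings/common" FileName="Test - Editor" PageCount="1" />
          <Data xmlns="http://dev.docuware.com/settings/common">!!!VALUE TO READ!!!</Data>
        </q2:SampleDocument>
      </q2:Body>
    </Config>
  </Configs>
</PrinterProcessDef>';

-- The namespaces are declared once and apply to every XQuery in the statement
WITH XMLNAMESPACES(DEFAULT 'http://dev.docuware.com/settings/workflow/processdef'
                  ,'http://dev.docuware.com/settings/workflow/processconfig' AS q2
                  ,'http://dev.docuware.com/settings/common' AS nd)
SELECT
    -- Unprefixed attributes are in no namespace; DEFAULT only affects elements
    cfg.value('@Id', 'uniqueidentifier') AS ConfigId,
    cfg.value('(q2:Body/q2:SampleDocument/nd:MetaData/@FileName)[1]', 'nvarchar(260)') AS FileName,
    cfg.value('(q2:Body/q2:SampleDocument/nd:Data/text())[1]', 'varchar(max)') AS SampleData
FROM @xml.nodes('/PrinterProcessDef/Configs/Config[@xsi:type="q2:RecognizeActConfig"]') AS A(cfg);
```

Against the real table, the same pattern would presumably need the ntext column cast to xml first (for example in a CROSS APPLY) before calling .nodes() on it.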

By the way: declaring a DEFAULT namespace in the other answers would avoid dummy prefixes such as top:.
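
Inside .value itself, the counterpart of DEFAULT is an XQuery prolog entry, declare default element namespace. The following is a small sketch of that variant, again run against a hypothetical, trimmed-down copy of the question's XML held in @xml:

```sql
-- Hypothetical, trimmed-down copy of the XML from the question
DECLARE @xml xml = N'
<PrinterProcessDef xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xmlns="http://dev.docuware.com/settings/workflow/processdef">
  <Configs>
    <Config xmlns:q2="http://dev.docuware.com/settings/workflow/processconfig"
            xsi:type="q2:RecognizeActConfig">
      <q2:Body>
        <q2:SampleDocument>
          <Data xmlns="http://dev.docuware.com/settings/common">!!!VALUE TO READ!!!</Data>
        </q2:SampleDocument>
      </q2:Body>
    </Config>
  </Configs>
</PrinterProcessDef>';

SELECT @xml.value(
   'declare default element namespace "http://dev.docuware.com/settings/workflow/processdef";
    declare namespace q2="http://dev.docuware.com/settings/workflow/processconfig";
    declare namespace nd="http://dev.docuware.com/settings/common";
    (: unprefixed steps now resolve to the processdef namespace, so no top: prefix is needed :)
    (/PrinterProcessDef/Configs/Config[@xsi:type="q2:RecognizeActConfig"]/q2:Body/q2:SampleDocument/nd:Data/text())[1]',
   'varchar(max)') AS SampleData;
```

The default applies to element names only, which is why Data still needs its own nd: prefix: that element lives in the common namespace, not in processdef.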