Trying to get a variable number of values out of messy XML data

Asked: 2018-07-09 19:23:40

Tags: sql sql-server tsql

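I have a "rawresponse" column in a table whose XML output is very messy. I'm trying to pull values out of it. Each value is always preceded by the same text string, but the values can occur a varying number of times.

For example, say I want the values of strengthValue. One record might have a "rawresponse" column that reads: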

  

{"header":{"to":{"qualifier":"ZZZ","text":"P00000000022805"}}"strength":{"strengthValue":"80.0","strengthForm":{"code":"package"}}},"quantity":{"value":"3.0"}

And another might read:

  

{"header":{"to":{"qualifier":"ZZZ"}"strength":{"strengthValue":"80.0","strengthForm":{"code":"package"}}},"quantity":{"value":"3.0"}"strength":{"strengthValue":"1.46","strengthForm":{"code":"package"}}}},"strength":{"strengthValue":"245.0","strengthForm":{"code":"package"}}}},"quantity":{"value":"3.0"}"strength":{"strengthValue":"80.0"}

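So in the first example strengthValue occurs once, with two digits before the decimal point and one after. In the second it occurs four times (with different text between the occurrences each time; it is completely non-standard), and the number of digits before and after the decimal point varies.

I tried to solve this with patindex and substring, following another solution I found here, but couldn't get it to work.

The output I'd really like (in a CTE, since I'll eventually need to do different things with it, such as getting the max value or counting how many times certain values occur) is the RecordID (another column) with one row for each strengthValue found for that RecordID, e.g.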

RecordID    Value
2-AAf-9     22.4
23-T-00     1.4
23-T-00     80.0
23-T-00     146.98
23-T-00     22.001

Suggestions?

2 Answers:

Answer 0: (score: 0)

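If you're open to a TVF (table-valued function)

Tired of extracting strings (left, right, charindex, etc.), I modified a parse/split function to accept two non-like delimiters. In your case, 'strengthValue":"' and '"'.

It's not quite clear how the desired results line up with the sample data.

Example, or see it on dbFiddle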
Declare @YourTable table (RecordID varchar(50),rawresponse varchar(max))
Insert Into @YourTable values
 ('2-AAf-9','{"header":{"to":{"qualifier":"ZZZ","text":"P00000000022805"}"strength":{"strengthValue":"80.0","strengthForm":{"code":"package"}}},"quantity":{"value":"3.0"}')
,('23-T-00','{"header":{"to":{"qualifier":"ZZZ"}"strength":{"strengthValue":"80.0","strengthForm":{"code":"package"}}},"quantity":{"value":"3.0"}"strength":{"strengthValue":"1.46","strengthForm":{"code":"package"}}},"strength":{"strengthValue":"245.0","strengthForm":{"code":"package"}}},"quantity":{"value":"3.0"}"strength":{"strengthValue":"80.0"}')

Select A.RecordID
      ,Value = try_convert(decimal(10,2),B.RetVal)
 From  @YourTable A
 Cross Apply [dbo].[tvf-Str-Extract](rawresponse,'strengthValue":"','"') B
 Where try_convert(money,B.RetVal) is not null

Returns

RecordID    Value
2-AAf-9     80.00
23-T-00     80.00
23-T-00     1.46
23-T-00     245.00
23-T-00     80.00
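
The question mentions wrapping the result in a CTE to later take the maximum or count occurrences. A minimal sketch of that follow-up step is below. The CTE name Extracted and the output column names are hypothetical, and the literal rows stand in for the extraction query so the snippet runs on its own; in practice the CTE body would presumably be the Select above.

```sql
-- Sketch only: "Extracted" is a hypothetical CTE name; the VALUES rows
-- stand in for the extraction query and mirror the result set above.
WITH Extracted (RecordID, Value) AS (
    SELECT v.RecordID, v.Value
    FROM (VALUES ('2-AAf-9', 80.00),
                 ('23-T-00', 80.00),
                 ('23-T-00', 1.46),
                 ('23-T-00', 245.00),
                 ('23-T-00', 80.00)) v(RecordID, Value)
)
SELECT RecordID,
       MaxValue   = MAX(Value),                                   -- largest strengthValue per record
       ValueCount = COUNT(*),                                     -- how many values were found
       Count80    = SUM(CASE WHEN Value = 80 THEN 1 ELSE 0 END)   -- occurrences of one specific value
FROM Extracted
GROUP BY RecordID;
```

With these stand-in rows, 23-T-00 has a maximum of 245.00 across four values, two of which equal 80.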

The TVF, if interested.

CREATE FUNCTION [dbo].[tvf-Str-Extract] (@String varchar(max),@Delimiter1 varchar(100),@Delimiter2 varchar(100))
Returns Table 
As
Return (  

with   cte1(N)   As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
       cte2(N)   As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 N1,cte1 N2,cte1 N3,cte1 N4,cte1 N5,cte1 N6) A ),
       cte3(N)   As (Select 1 Union All Select t.N+DataLength(@Delimiter1) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter1)) = @Delimiter1),
       cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter1,@String,s.N),0)-S.N,8000) From cte3 S)

Select RetSeq = Row_Number() over (Order By N)
      ,RetPos = N
      ,RetVal = left(RetVal,charindex(@Delimiter2,RetVal)-1) 
 From  (
        Select *,RetVal = Substring(@String, N, L) 
         From  cte4
       ) A
 Where charindex(@Delimiter2,RetVal)>1

)
/*
Max Length of String 1MM characters

Declare @String varchar(max) = 'Dear [[FirstName]] [[LastName]], ...'
Select * From [dbo].[tvf-Str-Extract] (@String,'[[',']]')
*/

BTW - I'm the one who deleted the first comment about this being JSON rather than XML

Answer 1: (score: 0)

Here is a solution using NGrams8K.

-- sample data
DECLARE @yourTable TABLE (recordID VARCHAR(20), rawresponse VARCHAR(8000));
INSERT @yourTable(recordID, rawresponse)
VALUES
('2-AAf-9','{"header":{"to":{"qualifier":"ZZZ","text":"P00000000022805"}"strength":{"strengthValue":"80.0","strengthForm":{"code":"package"}}},"quantity":{"value":"3.0"}'),
('23-T-00','{"header":{"to":{"qualifier":"ZZZ"}"strength":{"strengthValue":"80.0","strengthForm":{"code":"package"}}},"quantity":{"value":"3.0"}"strength":{"strengthValue":"1.46","strengthForm":{"code":"package"}}},"strength":{"strengthValue":"245.0","strengthForm":{"code":"package"}}},"quantity":{"value":"3.0"}"strength":{"strengthValue":"80.0"}');

-- solution
SELECT 
  t.recordID, 
  strengthValue = SUBSTRING(string.part,1,CHARINDEX('"', string.part)-1)
FROM @yourTable t
CROSS JOIN (VALUES ('"strengthValue":"')) f(searchTxt)
CROSS APPLY dbo.ngrams8k(t.rawresponse, LEN(f.searchTxt)) ng
CROSS APPLY (VALUES (SUBSTRING(t.rawresponse, ng.position+LEN(f.searchTxt),30))) string(part)
WHERE ng.token = f.searchTxt;

Results:

recordID             strengthValue
-------------------- ------------------------------
2-AAf-9              80.0
23-T-00              80.0
23-T-00              1.46
23-T-00              245.0
23-T-00              80.0
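
Both answers rely on a helper function that has to be created in the database first. If that isn't an option, the same scan can probably be done with built-ins only. The following is a rough sketch, not either answerer's method: it uses a recursive CTE over CHARINDEX to walk from one occurrence of the marker to the next. The names @marker, hits, startPos and endPos are made up for illustration, and TRY_CONVERT assumes SQL Server 2012 or later.

```sql
-- sample data (same rows as in the answers above)
DECLARE @yourTable TABLE (recordID VARCHAR(20), rawresponse VARCHAR(8000));
INSERT @yourTable(recordID, rawresponse)
VALUES
('2-AAf-9','{"header":{"to":{"qualifier":"ZZZ","text":"P00000000022805"}"strength":{"strengthValue":"80.0","strengthForm":{"code":"package"}}},"quantity":{"value":"3.0"}'),
('23-T-00','{"header":{"to":{"qualifier":"ZZZ"}"strength":{"strengthValue":"80.0","strengthForm":{"code":"package"}}},"quantity":{"value":"3.0"}"strength":{"strengthValue":"1.46","strengthForm":{"code":"package"}}},"strength":{"strengthValue":"245.0","strengthForm":{"code":"package"}}},"quantity":{"value":"3.0"}"strength":{"strengthValue":"80.0"}');

-- hypothetical variable name: the text that always precedes a value
DECLARE @marker VARCHAR(100) = '"strengthValue":"';

WITH hits AS (
    -- anchor member: first occurrence of the marker in each row
    SELECT t.recordID, t.rawresponse,
           pos = CHARINDEX(@marker, t.rawresponse)
    FROM @yourTable t
    WHERE CHARINDEX(@marker, t.rawresponse) > 0
    UNION ALL
    -- recursive member: next occurrence after the previous hit
    SELECT h.recordID, h.rawresponse,
           CHARINDEX(@marker, h.rawresponse, h.pos + 1)
    FROM hits h
    WHERE CHARINDEX(@marker, h.rawresponse, h.pos + 1) > 0
)
SELECT h.recordID,
       strengthValue = TRY_CONVERT(DECIMAL(18,4),
                           SUBSTRING(h.rawresponse, s.startPos, e.endPos - s.startPos))
FROM hits h
-- the value starts right after the marker ...
CROSS APPLY (VALUES (h.pos + LEN(@marker))) s(startPos)
-- ... and ends at the next double quote; NULLIF avoids a negative length when none is found
CROSS APPLY (VALUES (NULLIF(CHARINDEX('"', h.rawresponse, s.startPos), 0))) e(endPos)
WHERE e.endPos IS NOT NULL
ORDER BY h.recordID, h.pos
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```

On the sample rows this should find the same five recordID/value pairs as above. The column is declared VARCHAR(8000) here; with VARCHAR(MAX), CHARINDEX returns bigint, so the anchor member's pos would need a matching cast.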