如何从存储过程中的固定宽度文件中提取列

时间:2017-04-28 15:10:07

标签: sql tsql sql-server-2012

我有文字格式的报告。我必须将此文本文件加载到表中的一列中,然后编写存储过程以将文本文件拆分为列。下面是我的文本文件的示例。

XXX XXXXXXX                                                                                                       XXXXXX XXXXX XXXXX                                    
XXX, XX 70000-80000                                         01/01/2000  To  01/31/2000                               XXXXX XX 00000                                       
                                                            Fax Number 100 100-1010                                                                                       
*** Note: DO NOT MAIL                                                                                                                                                     

 Id         Name         #_No       #_No_1       Co-Pay        Paid Amount       Admin Fees     Misc Charges         Amount Due
---------------------------------------------------------------------------------------------------------------------------------------------------------
Q10000000  QQQQQQ       330         144         4,740.24-         33,167.74             0.00           690.00          29,117.50 


Q10000001  YYYYYY       1,503        694        12,588.88-        269,629.24             0.00         2,880.00         259,920.36 


Q10000002  MMMMMM        106          20         2,395.50-         13,672.42             0.00           270.00          11,546.92 


Q10000003  NNNNNNN        481         174         5,947.75-         32,983.23             0.00           810.00          27,845.48 

XXX XXXXXXX                                                                                                       XXXXXX XXXXX XXXXX                                    
XXX, XX 70000-80000                                         02/01/2000  To  02/31/2000                               XXXXX XX 00000                                       
                                                            Fax Number 100 100-1010                                                                                       
*** Note: DO NOT MAIL                                                                                                                                                     

 Id         Name         #_No       #_No_1       Co-Pay        Paid Amount       Admin Fees     Misc Charges         Amount Due
---------------------------------------------------------------------------------------------------------------------------------------------------------
Q10000000  QQQQQQ       330         144         4,740.24-         33,167.74             0.00           690.00          29,117.50 


Q10000001  YYYYYY       1,503        694        12,588.88-        269,629.24             0.00         2,880.00         259,920.36 


Q10000002  MMMMMM        106          20         2,395.50-         13,672.42             0.00           270.00          11,546.92 


Q10000003  NNNNNNN        481         174         5,947.75-         32,983.23             0.00           810.00          27,845.48 

上面的报告将作为表格中的一列,并将报告拆分为多列。 列格式如下:

Id         Name         #_No       #_No_1       Co-Pay        Paid Amount       Admin Fees     Misc Charges       Amount Due     start_date     End_Date

Start_Date =01/01/2000 and 02/01/2000
end_date=01/31/2000 and 02/31/2000 

这些来自标题。如何在存储过程中提取这些列? 任何帮助表示赞赏。

1 个答案:

答案 0 :(得分:1)

借助两个UDF和一个Cross Apply(或两个)

创建示例数据

Declare @YourTable table (ID int,Report varchar(max))
Insert Into @YourTable values
(1,'
XX XXXXXXX                                                                                                       XXXXXX XXXXX XXXXX                                    
XXX, XX 70000-80000                                         01/01/2000  To  01/31/2000                               XXXXX XX 00000                                       
                                                            Fax Number 100 100-1010                                                                                       
*** Note: DO NOT MAIL                                                                                                                                                     

 Id         Name         #_No       #_No_1       Co-Pay        Paid Amount       Admin Fees     Misc Charges         Amount Due
---------------------------------------------------------------------------------------------------------------------------------------------------------
Q10000000  QQQQQQ       330         144         4,740.24-         33,167.74             0.00           690.00          29,117.50 


Q10000001  YYYYYY       1,503        694        12,588.88-        269,629.24             0.00         2,880.00         259,920.36 


Q10000002  MMMMMM        106          20         2,395.50-         13,672.42             0.00           270.00          11,546.92 


Q10000003  NNNNNNN        481         174         5,947.75-         32,983.23             0.00           810.00          27,845.48 

XXX XXXXXXX                                                                                                       XXXXXX XXXXX XXXXX                                    
XXX, XX 70000-80000                                         02/01/2000  To  02/31/2000                               XXXXX XX 00000                                       
                                                            Fax Number 100 100-1010                                                                                       
*** Note: DO NOT MAIL                                                                                                                                                     

 Id         Name         #_No       #_No_1       Co-Pay        Paid Amount       Admin Fees     Misc Charges         Amount Due
---------------------------------------------------------------------------------------------------------------------------------------------------------
Q10000000  QQQQQQ       330         144         4,740.24-         33,167.74             0.00           690.00          29,117.50 


Q10000001  YYYYYY       1,503        694        12,588.88-        269,629.24             0.00         2,880.00         259,920.36 


Q10000002  MMMMMM        106          20         2,395.50-         13,672.42             0.00           270.00          11,546.92 


Q10000003  NNNNNNN        481         174         5,947.75-         32,983.23             0.00           810.00          27,845.48 
')

查询示例

Select A.ID
      ,[Date1]       = left(B.Date,10)
      ,[Date2]       = right(B.Date,10)
      ,[Row ID]      = C.Pos1
      ,[Name]        = C.Pos2
      ,[#_No]        = try_convert(int,try_convert(money,C.Pos3))
      ,[#_No_1]      = try_convert(int,try_convert(money,C.Pos4))
      ,[Co-Pay]      = try_convert(money,IIF(right(Pos5,1)='-','-','')+replace(Pos5,'-',''))
      ,[Paid Amount] = try_convert(money,IIF(right(Pos6,1)='-','-','')+replace(Pos6,'-',''))
      ,[Admin Fees]  = try_convert(money,IIF(right(Pos7,1)='-','-','')+replace(Pos7,'-',''))
      ,[Misc Charges]= try_convert(money,IIF(right(Pos8,1)='-','-','')+replace(Pos8,'-',''))
      ,[Amount Due]  = try_convert(money,IIF(right(Pos9,1)='-','-','')+replace(Pos9,'-','')) 
 from @YourTable A
 Cross Apply (
               Select RetSeq,RetVal,Date = max(Date) over (Order by RetSeq)
               From (
                       Select *
                             ,Date = case when patindex('%__/__/____|To|__/__/____%',RetVal)=0 then '' else substring(RetVal,patindex('%__/__/____|To|__/__/____%',RetVal),24) end
                        From  [dbo].[udf-Str-Parse](replace(replace(replace(replace(A.Report,'  ','||'),'|','<>'),'><',''),'<>','|'),char(13)) 
                    ) B1
             ) B
 Cross Apply [dbo].[udf-Str-Parse-Row](B.RetVal,'|') C
 Where try_convert(money,Pos4) is not null
 Order By B.RetSeq

<强>返回

enter image description here

有兴趣的两个UDF

CREATE FUNCTION [dbo].[udf-Str-Parse] (@String varchar(max),@Delimiter varchar(10))
Returns Table 
As
Return (  
    Select RetSeq = Row_Number() over (Order By (Select null))
          ,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
    From  (Select x = Cast('<x>' + replace((Select replace(@String,@Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.')) as A 
    Cross Apply x.nodes('x') AS B(i)
);
--Thanks Shnugo for making this XML safe
--Select * from [dbo].[udf-Str-Parse]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ')
--Select * from [dbo].[udf-Str-Parse]('this,is,<test>,for,< & >',',')

和...

CREATE FUNCTION [dbo].[udf-Str-Parse-Row] (@String varchar(max),@Delimiter varchar(10))
Returns Table 
As
Return (
    Select Pos1 = ltrim(rtrim(xDim.value('/x[1]','varchar(max)')))
          ,Pos2 = ltrim(rtrim(xDim.value('/x[2]','varchar(max)')))
          ,Pos3 = ltrim(rtrim(xDim.value('/x[3]','varchar(max)')))
          ,Pos4 = ltrim(rtrim(xDim.value('/x[4]','varchar(max)')))
          ,Pos5 = ltrim(rtrim(xDim.value('/x[5]','varchar(max)')))
          ,Pos6 = ltrim(rtrim(xDim.value('/x[6]','varchar(max)')))
          ,Pos7 = ltrim(rtrim(xDim.value('/x[7]','varchar(max)')))
          ,Pos8 = ltrim(rtrim(xDim.value('/x[8]','varchar(max)')))
          ,Pos9 = ltrim(rtrim(xDim.value('/x[9]','varchar(max)')))
    From  (Select Cast('<x>' + replace((Select replace(@String,@Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml) as xDim) as A 
)
--Thanks Shnugo for making this XML safe
--Select * from [dbo].[udf-Str-Parse-Row]('Dog,Cat,House,Car',',')