SSIS: removing unwanted characters

Time: 2018-02-01 22:51:11

Tags: sql sql-server sharepoint ssis etl

How can I remove unwanted characters from between pieces of text in SSIS?

For example, we have data like this

2134;#Adam Connor (aconnor),21987;#Tatanka Wabe (Twabe);# 

when it comes from SharePoint. I tried SUBSTRING, REPLACE, and so on, but could not remove the numbers between the names.

I want the output to be

Adam Connor, Tatanka Wabe

2 answers:

Answer 0: (score: 1)

You can use a regular expression.

Note: the code is in VB.NET

You need to extract the strings between # and (
Dim mc As MatchCollection = Regex.Matches(strContent, "(?<=\#)(.*?)(?=\()", RegexOptions.Singleline)

Then you need to join them, separated by commas

String.Join(", ", mc.Cast(Of Match)().Select(Function(m) m.Value.Trim()).ToArray())
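
If you want to try the two steps above outside SSIS first, a minimal console sketch could look like the following. The module name RegexDemo and the helper ExtractNames are hypothetical names made up for illustration; they are not part of any SSIS API. The sketch assumes you want each name trimmed and joined with ", ", because the raw match keeps the space that sits before the opening parenthesis.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module RegexDemo

    ' Hypothetical helper: returns the text found between each "#" and the next "(",
    ' trimmed and joined with ", ".
    Function ExtractNames(ByVal strContent As String) As String
        Dim mc As MatchCollection = Regex.Matches(strContent, "(?<=\#)(.*?)(?=\()", RegexOptions.Singleline)
        Return String.Join(", ", mc.Cast(Of Match)().Select(Function(m) m.Value.Trim()).ToArray())
    End Function

    Sub Main()
        Dim sample As String = "2134;#Adam Connor (aconnor),21987;#Tatanka Wabe (Twabe);#"
        ' Prints: Adam Connor, Tatanka Wabe
        Console.WriteLine(ExtractNames(sample))
    End Sub

End Module
```

The trailing ";#" in the sample produces no match, because no "(" follows it, so no empty entry ends up in the result.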

SSIS version - using a Script Component

You can achieve this with a regular expression inside a Script Component:

Assuming Column0 is the input column and OutColumn is the output column

Imports System
Imports System.Data
Imports System.Linq
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Imports System.Text.RegularExpressions

<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute> _
<CLSCompliant(False)> _
Public Class ScriptMain
    Inherits UserComponent

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

        If Not Row.Column0_IsNull AndAlso _
           Not String.IsNullOrEmpty(Row.Column0.Trim) Then

            Dim strContent As String = Row.Column0

            ' Capture everything between each "#" and the next "("
            Dim mc As MatchCollection = Regex.Matches(strContent, "(?<=\#)(.*?)(?=\()", RegexOptions.Singleline)

            ' Trim each name (the match keeps the space before "(") and join with ", "
            Row.OutColumn = String.Join(", ", mc.Cast(Of Match)().Select(Function(m) m.Value.Trim()).ToArray())

        Else

            ' No usable input: pass NULL through to the output column
            Row.OutColumn_IsNull = True

        End If

    End Sub

End Class


Answer 1: (score: 1)

This works if the sample data is representative of the pattern and you are open to a table-valued function.

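Some time ago, tired of extracting strings by hand (LEFT, RIGHT, SUBSTRING, CHARINDEX, PATINDEX, etc.), I modified a parse function to accept two dissimilar delimiters. In this case, the delimiters are # and (.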

Example

Declare @YourTable table (ID int,SomeCol varchar(max))
Insert Into @YourTable values
(1,'2134;#Adam Connor (aconnor),21987;#Tatanka Wabe (Twabe);#')

Select A.ID
      ,B.*
 From  @YourTable A
 Cross Apply (
                Select NewVal = Stuff((Select ', ' +ltrim(rtrim(RetVal)) 
                                         From [dbo].[tvf-Str-Extract](A.SomeCol,'#','(') 
                                         For XML Path ('')
                                      ),1,2,'')
             ) B

Returns

ID  NewVal
1   Adam Connor, Tatanka Wabe

The function, if you are interested

CREATE FUNCTION [dbo].[tvf-Str-Extract] (@String varchar(max),@Delimiter1 varchar(100),@Delimiter2 varchar(100))
Returns Table 
As
Return (  

with   cte1(N)   As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
       cte2(N)   As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 N1,cte1 N2,cte1 N3,cte1 N4,cte1 N5,cte1 N6) A ),
       cte3(N)   As (Select 1 Union All Select t.N+DataLength(@Delimiter1) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter1)) = @Delimiter1),
       cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter1,@String,s.N),0)-S.N,8000) From cte3 S)

Select RetSeq = Row_Number() over (Order By N)
      ,RetPos = N
      ,RetVal = left(RetVal,charindex(@Delimiter2,RetVal)-1) 
 From  (
        Select *,RetVal = Substring(@String, N, L) 
         From  cte4
       ) A
 Where charindex(@Delimiter2,RetVal)>1

)
/*
Max Length of String 1MM characters

Declare @String varchar(max) = 'Dear [[FirstName]] [[LastName]], ...'
Select * From [dbo].[tvf-Str-Extract] (@String,'[[',']]')
*/
  

Note:

If you were to simply run

Declare @YourTable table (ID int,SomeCol varchar(max))
Insert Into @YourTable values
(1,'2134;#Adam Connor (aconnor),21987;#Tatanka Wabe (Twabe);#')

Select A.ID
      ,B.*
 From  @YourTable A
 Cross Apply [dbo].[tvf-Str-Extract](A.SomeCol,'#','(')  B

you would get

ID  RetSeq  RetPos  RetVal
1   1       7       Adam Connor 
1   2       36      Tatanka Wabe