How to remove unwanted characters between text in SSIS
We have data like this:
2134;#Adam Connor (aconnor),21987;#Tatanka Wabe (Twabe);#
when it comes from SharePoint. I tried Substring, Replace, and so on, but could not remove the numbers between the names.
I want the output to be:
Adam Connor, Tatanka Wabe
Answer 0 (score: 1)
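Note: the code is in VB.NET.
You need to match the text between # and (:
Dim mc As MatchCollection = Regex.Matches(strContent, "(?<=\#)(.*?)(?=\()", RegexOptions.Singleline)
Then trim each match (the capture keeps the space in front of the parenthesis) and join them with a comma separator. Cast and Select need Imports System.Linq:
String.Join(", ", mc.Cast(Of Match)().Select(Function(m) m.Value.Trim()))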
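To check the pattern outside SSIS, here is a minimal Python sketch of the same two steps (match, then join). The function name extract_names is made up for illustration, and the sketch assumes Python's re module handles this lookbehind/lookahead pattern the same way .NET does, which should hold for the sample row. Each capture keeps the space in front of (, so the sketch strips it before joining.

```python
import re

# Same idea as the VB.NET pattern: lazily capture whatever sits
# between a '#' and the next '('. DOTALL plays the role of Singleline.
PATTERN = re.compile(r"(?<=#)(.*?)(?=\()", re.DOTALL)


def extract_names(content: str) -> str:
    """Return the text found between '#' and '(' as one comma-separated string.

    extract_names is a hypothetical helper, not part of SSIS or SharePoint.
    """
    # Strip each capture, because the raw match ends with the space before '('.
    names = [m.group(1).strip() for m in PATTERN.finditer(content)]
    return ", ".join(names)


if __name__ == "__main__":
    sample = "2134;#Adam Connor (aconnor),21987;#Tatanka Wabe (Twabe);#"
    print(extract_names(sample))
```

If the pattern behaves here as expected, the same string can be pasted into the Script Component unchanged.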
You can achieve this with a regular expression in a Script Component.
Assuming Column0 is the input column and outColumn is the output column:
Imports System
Imports System.Data
Imports System.Linq
Imports System.Math
Imports System.Text.RegularExpressions
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute> _
<CLSCompliant(False)> _
Public Class ScriptMain
    Inherits UserComponent

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        If Not Row.Column0_IsNull AndAlso _
           Not String.IsNullOrEmpty(Row.Column0.Trim) Then
            Dim strContent As String = Row.Column0
            ' Capture the text between "#" and "(" for every entry in the row.
            Dim mc As MatchCollection = Regex.Matches(strContent, "(?<=\#)(.*?)(?=\()", RegexOptions.Singleline)
            ' Trim the space left in front of "(" and join the names.
            Row.OutColumn = String.Join(", ", mc.Cast(Of Match)().Select(Function(m) m.Value.Trim()))
        Else
            ' Nothing to parse: pass a NULL through to the output column.
            Row.OutColumn_IsNull = True
        End If
    End Sub
End Class
Answer 1 (score: 1)
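If the sample data is representative of the pattern, and you are open to a table-valued function:
Some time ago, tired of extracting strings (Left, Right, Substring, CharIndex, PatIndex, etc.), I modified a parse function to accept two dissimilar delimiters. In this case, # and (.
Example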
Declare @YourTable table (ID int,SomeCol varchar(max))
Insert Into @YourTable values
(1,'2134;#Adam Connor (aconnor),21987;#Tatanka Wabe (Twabe);#')

Select A.ID
      ,B.*
 From  @YourTable A
 Cross Apply (
                Select NewVal = Stuff((Select ', ' +ltrim(rtrim(RetVal))
                                        From  [dbo].[tvf-Str-Extract](A.SomeCol,'#','(')
                                        For XML Path ('')
                                      ),1,2,'')
             ) B
Returns
ID  NewVal
1   Adam Connor, Tatanka Wabe
The function, if you are interested:
CREATE FUNCTION [dbo].[tvf-Str-Extract] (@String varchar(max),@Delimiter1 varchar(100),@Delimiter2 varchar(100))
Returns Table
As
Return (
    with cte1(N)   As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
         cte2(N)   As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 N1,cte1 N2,cte1 N3,cte1 N4,cte1 N5,cte1 N6) A ),
         cte3(N)   As (Select 1 Union All Select t.N+DataLength(@Delimiter1) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter1)) = @Delimiter1),
         cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter1,@String,s.N),0)-S.N,8000) From cte3 S)
    Select RetSeq = Row_Number() over (Order By N)
          ,RetPos = N
          ,RetVal = left(RetVal,charindex(@Delimiter2,RetVal)-1)
     From  (
            Select *,RetVal = Substring(@String, N, L)
             From  cte4
           ) A
     Where charindex(@Delimiter2,RetVal)>1
)
/*
Max Length of String 1MM characters
Declare @String varchar(max) = 'Dear [[FirstName]] [[LastName]], ...'
Select * From [dbo].[tvf-Str-Extract] (@String,'[[',']]')
*/
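For anyone who would rather not step through the CTEs, the rough Python sketch below mirrors the function's logic as I read it: collect a start position after every first delimiter, take the text up to the next first delimiter, and keep the part before the second delimiter. The name str_extract is hypothetical, positions are counted in characters rather than DataLength bytes, and the 8000-character cap on the last segment is ignored.

```python
def str_extract(text, delim1, delim2):
    """Return (RetSeq, RetPos, RetVal) tuples, loosely mirroring tvf-Str-Extract.

    str_extract is an illustrative stand-in, not the SQL function itself.
    """
    # Like cte3: position 0 plus the position right after every delim1.
    starts = [0]
    i = text.find(delim1)
    while i != -1:
        starts.append(i + len(delim1))
        i = text.find(delim1, i + 1)

    rows = []
    for start in starts:
        # Like cte4: the segment runs up to the next delim1, or to the end.
        nxt = text.find(delim1, start)
        segment = text[start:] if nxt == -1 else text[start:nxt]
        # Like the Where clause: delim2 must be present and not the first character.
        cut = segment.find(delim2)
        if cut > 0:
            # RetPos is reported 1-based, as SQL Server counts positions.
            rows.append((len(rows) + 1, start + 1, segment[:cut]))
    return rows


if __name__ == "__main__":
    sample = "2134;#Adam Connor (aconnor),21987;#Tatanka Wabe (Twabe);#"
    for row in str_extract(sample, "#", "("):
        print(row)
```

Note that the values come back untrimmed (the space in front of the second delimiter is still there), which is why the query wraps RetVal in ltrim(rtrim()).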
Note:
If you simply run:
Declare @YourTable table (ID int,SomeCol varchar(max))
Insert Into @YourTable values
(1,'2134;#Adam Connor (aconnor),21987;#Tatanka Wabe (Twabe);#')

Select A.ID
      ,B.*
 From  @YourTable A
 Cross Apply [dbo].[tvf-Str-Extract](A.SomeCol,'#','(') B
you would get:
ID  RetSeq  RetPos  RetVal
1   1       7       Adam Connor
1   2       36      Tatanka Wabe