Split a very large string with a custom delimiter?

Time: 2011-06-23 21:16:34

Tags: sql sql-server sql-server-2008

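I'm trying to determine word frequency in a VARCHAR(3000) column. I'm not sure this is the best data type, but the table design is out of my hands. In any case, up to this point I have been using the following function (taken from here) to split the strings: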

CREATE FUNCTION dbo.Split
(
    @RowData nvarchar(2000),
    @SplitOn nvarchar(5)
)  
RETURNS @RtnValue table 
(
    Id int identity(1,1),
    Data nvarchar(100)
) 
AS  
BEGIN 
    Declare @Cnt int
    Set @Cnt = 1

    While (Charindex(@SplitOn,@RowData)>0)
    Begin
        Insert Into @RtnValue (data)
        Select 
            Data = ltrim(rtrim(Substring(@RowData,1,Charindex(@SplitOn,@RowData)-1)))

        Set @RowData = Substring(@RowData,Charindex(@SplitOn,@RowData)+1,len(@RowData))
        Set @Cnt = @Cnt + 1
    End

    Insert Into @RtnValue (data)
    Select Data = ltrim(rtrim(@RowData))

    Return
END

It is used like this:

SELECT s FROM dbo.Split(' ', @description)

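It worked very well, but now I'm getting this error:

The statement terminated. The maximum recursion 100 has been exhausted before statement completion.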

Does anyone have a suggestion for a good way to accomplish this?
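
For context, that message is what SQL Server reports when a recursive common table expression (CTE) goes past the default limit of 100 recursion levels. The loop-based function shown above does not recurse, so the error probably comes from a CTE-based splitter. The following is only a minimal sketch of such a splitter, assuming a single-character delimiter; the CTE name `Pieces`, its columns, and the sample text are hypothetical. It lifts the limit with the `MAXRECURSION` query hint:

```sql
-- Hypothetical sample input; in practice this would be the column value.
DECLARE @description VARCHAR(3000) = 'the quick brown fox jumps over the lazy dog';
DECLARE @delimiter CHAR(1) = ' ';

WITH Pieces (start_pos, end_pos) AS
(
    -- Anchor row: the first piece starts at position 1.
    SELECT 1, CHARINDEX(@delimiter, @description)
    UNION ALL
    -- Recursive row: continue searching just past the previous delimiter.
    SELECT end_pos + 1, CHARINDEX(@delimiter, @description, end_pos + 1)
    FROM Pieces
    WHERE end_pos > 0
)
SELECT SUBSTRING(@description, start_pos,
                 CASE WHEN end_pos > 0 THEN end_pos - start_pos ELSE 3000 END) AS word
FROM Pieces
-- 0 removes the limit; the default is 100 and the largest explicit value is 32767.
OPTION (MAXRECURSION 0);
```

Each delimiter costs one recursion level, so any text with more than about 100 words hits the default limit. The hint has to be placed on the outermost statement; it cannot be put inside a view or an inline table-valued function.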

3 Answers:

Answer 0: (score: 2)

Never mind. In case anyone else runs into the same problem, the following function, from here, works for large strings:

CREATE FUNCTION dbo.SplitLarge(@String varchar(8000), @Delimiter char(1))     
returns @temptable TABLE (items varchar(8000))     
as     
begin     
    declare @idx int     
    declare @slice varchar(8000)     

    select @idx = 1     
        if len(@String)<1 or @String is null  return     

    while @idx!= 0     
    begin     
        set @idx = charindex(@Delimiter,@String)     
        if @idx!=0     
            set @slice = left(@String,@idx - 1)     
        else     
            set @slice = @String     

        if(len(@slice)>0)
            insert into @temptable(Items) values(@slice)     

        set @String = right(@String,len(@String) - @idx)     
        if len(@String) = 0 break     
    end 
return     
end
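
Since the goal in the question is word frequency, here is a rough sketch of how the function above might be applied to a whole column. It assumes `dbo.SplitLarge` has already been created as shown; the table variable `@Docs` and its columns are made up for illustration and stand in for the real table.

```sql
-- Hypothetical stand-in for the real table with the VARCHAR(3000) column.
DECLARE @Docs TABLE (doc_id INT, description VARCHAR(3000));

INSERT INTO @Docs (doc_id, description)
VALUES (1, 'the cat sat on the mat'),
       (2, 'the dog');

-- CROSS APPLY runs the splitter once per row; GROUP BY then counts each word.
SELECT s.items AS word, COUNT(*) AS frequency
FROM @Docs AS d
CROSS APPLY dbo.SplitLarge(d.description, ' ') AS s
GROUP BY s.items
ORDER BY frequency DESC, word;
```

Whether words that differ only in case are counted together depends on the collation of the column.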

Answer 1: (score: 2)

This function, taken from here, uses .nodes() and avoids both loops and recursive CTEs.

CREATE FUNCTION dbo.Split(@data NVARCHAR(MAX), @delimiter NVARCHAR(5))
RETURNS @t TABLE (data NVARCHAR(max))
AS
BEGIN

    DECLARE @textXML XML;
    SELECT    @textXML = CAST('<d>' + REPLACE(@data, @delimiter, '</d><d>') + '</d>' AS XML);

    INSERT INTO @t(data)
    SELECT  T.split.value('.', 'nvarchar(max)') AS data
    FROM    @textXML.nodes('/d') T(split)

    RETURN
END
GO
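
One caveat with the XML approach: the CAST fails if the text contains characters that are special in XML, such as `&` or `<`. A possible workaround, shown here only as a sketch, is to escape them before building the XML; the sample string and the `@escaped` variable are hypothetical, and the sketch assumes the delimiter itself contains none of those characters.

```sql
-- Hypothetical input containing characters that would break a plain CAST to XML.
DECLARE @data NVARCHAR(MAX) = N'salt & pepper <mild>';
DECLARE @delimiter NVARCHAR(5) = N' ';

-- Escape & first so the entities introduced for < and > are not escaped again.
DECLARE @escaped NVARCHAR(MAX) =
    REPLACE(REPLACE(REPLACE(@data, N'&', N'&amp;'), N'<', N'&lt;'), N'>', N'&gt;');

DECLARE @textXML XML =
    CAST(N'<d>' + REPLACE(@escaped, @delimiter, N'</d><d>') + N'</d>' AS XML);

-- value() returns the text with the entities decoded back to the original characters.
SELECT T.split.value('.', 'nvarchar(max)') AS data
FROM @textXML.nodes('/d') AS T(split);
```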

Answer 2: (score: 0)

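I came across the following code, which suited my situation. It uses the REPLACE function together with SQL Server 2008's ability to insert multiple rows with a single INSERT statement. The only drawback of this approach, if it really is one, is that it is limited to 1000 splits.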

IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[StringSegmenter]') AND type in (N'U'))
DROP TABLE [dbo].[StringSegmenter]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[StringSegmenter]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[StringSegmenter](
    [ss_id] [int] IDENTITY(1,1) NOT NULL,
    [ss_segment] [varchar](max) NOT NULL,
 CONSTRAINT [PK_StringSegmenter] PRIMARY KEY CLUSTERED 
(
    [ss_id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
END
GO

SET ANSI_PADDING OFF
GO

truncate table scratchpad.dbo.stringsegmenter
declare @String varchar(max)
declare @Splitter varchar(10)
set @String = '1,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,2222222222222222222,3,4,55555555555555555555555,2222222222222222222,3,4,55555555555555555555555,666666666666666666666666666666,2222222222222222222,3,4,55555555555555555555555,666666666666666666666666666666,2222222222222222222,3,4,55555555555555555555555,666666666666666666666666666666,2222222222222222222,3,4,55555555555555555555555,666666666666666666666666666666'
set @Splitter = ','
set @String = 'Insert [dbo].[StringSegmenter] Values (''' + replace(@string, @splitter,'''),(''') + ''')'
select @String
execute (@String)

Select * from [dbo].[StringSegmenter] order by ss_id
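
Two things to watch with this dynamic SQL: a single quote inside the input breaks the generated statement (and opens it to injection), and the 1000-row cap applies to rows listed directly in INSERT ... VALUES. The variation below is a sketch only. It doubles embedded quotes and wraps the VALUES list in a derived table, which the SQL Server documentation lists as a way around the cap. The temp table `#Segments`, its columns, and the sample string are hypothetical, and the splitter is assumed not to contain a single quote.

```sql
-- Hypothetical scratch table; a temp table is visible to the EXECUTE'd batch.
CREATE TABLE #Segments
(
    seg_id  INT IDENTITY(1,1) PRIMARY KEY,
    segment VARCHAR(MAX) NOT NULL
);

DECLARE @String   VARCHAR(MAX) = 'it''s,a,test';  -- the value is: it's,a,test
DECLARE @Splitter VARCHAR(10)  = ',';
DECLARE @Sql      VARCHAR(MAX);

-- Double any single quotes first, then turn each splitter into '),('
SET @Sql = 'INSERT #Segments (segment) SELECT v FROM (VALUES ('''
         + REPLACE(REPLACE(@String, '''', ''''''), @Splitter, '''),(''')
         + ''')) AS t(v)';

EXECUTE (@Sql);

SELECT seg_id, segment FROM #Segments ORDER BY seg_id;

DROP TABLE #Segments;
```

With this form the identity values are not guaranteed to follow the order of the pieces in the original string, so it suits frequency counts better than anything that depends on position.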