How can I split a block of text into individual sentences in SQL Server?

Time: 2016-07-24 14:53:28

Tags: sql-server parsing text

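I have a block of text containing several sentences, and I want to split them up and show each sentence on a new line. I have tried CHARINDEX and SUBSTRING, but once you get past the second sentence the code becomes very complicated and hard to repeat. I got as far as two sentences and gave up when I realised the code would quickly snowball: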

DECLARE @TEXT NVARCHAR(MAX) = 'Has many applications. The price is low. The quality is good. Availability is widespread.'
DECLARE @TEXTLine1 NVARCHAR(MAX) = LEFT(@TEXT,CHARINDEX('.',@TEXT))
DECLARE @TEXTLine2 NVARCHAR(MAX) = SUBSTRING(@TEXT,CHARINDEX('.',@TEXT)+2,CHARINDEX('.',SUBSTRING(@TEXT,CHARINDEX('.',@TEXT)+2,50)))
PRINT @TEXTLine1
PRINT @TEXTLine2

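As you can see, I am splitting the sentences on the full stop. Is there a way to tell SUBSTRING to find the nth instance of a character, for example the fifth? That would make the task simple.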
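For reference, SUBSTRING has no "nth occurrence" option, but CHARINDEX accepts an optional third argument giving the position to start searching from, so the nth full stop can be found by calling it repeatedly. The following is only a minimal sketch; the variable names @n, @pos and @i are illustrative and not part of the question's code.

```sql
-- Sketch: find the position of the @n-th full stop in @TEXT.
DECLARE @TEXT NVARCHAR(MAX) = N'Has many applications. The price is low. The quality is good. Availability is widespread.';
DECLARE @n   INT = 3;   -- which occurrence to look for (illustrative value)
DECLARE @pos INT = 0;   -- position of the most recent match; 0 means "not found"
DECLARE @i   INT = 0;   -- number of occurrences found so far

WHILE @i < @n
BEGIN
    -- Resume the search one character after the previous match.
    SET @pos = CHARINDEX('.', @TEXT, @pos + 1);
    IF @pos = 0 BREAK;  -- the text has fewer than @n full stops
    SET @i += 1;
END;

PRINT @pos;             -- 0 if there is no @n-th full stop
```

The same loop is the core of most hand-written split routines: each match position marks the end of one sentence and the start of the next.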

3 answers:

Answer 0: (score: 2)

Use one of the split string functions from here.

After that, it is easy to do:

DECLARE @TEXT NVARCHAR(MAX) = 'Has many applications. The price is low. The quality is good. Availability is widespread.'

select * from
[dbo].[SplitStrings_Numbers](@text,'.')

Output

Item
 Has many applications
 The price is low
 The quality is good
 Availability is widespread

Answer 1: (score: 0)

While thinking about my problem I came up with this. It looks like overkill, but I cannot think of any other way:

DECLARE @text NVARCHAR(MAX) = 'Has many applications. The price is low. The quality is good. Availability is widespread.'; --Set text.
DECLARE @text2 NVARCHAR(MAX) = LEFT(@text,CHARINDEX('.',@text)); --Extract first sentence.

PRINT @text2; --Print first sentence.

SET @text = RIGHT(@text,LEN(@text)-LEN(@text2)); --Remove @text2 from @text; the remainder still starts with the space that followed the full stop.

WHILE LEN(@text) >0 
BEGIN 
SET @text = RIGHT(@text, LEN(@text)-1);--Take off the space that followed the full stop in the previous iteration of @text.
SET @text2 = LEFT(@text,CHARINDEX('.',@text));--Extract the 'new' first sentence.
PRINT @text2; 
SET @text = RIGHT(@text,LEN(@text)-LEN(@text2)); --Remove @text2 from @text; the remainder still starts with the space that followed the full stop.
END;

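Other suggestions are welcome.

Edit - John Cappelletti's excellent answer inspired me to improve my own. It is no longer tripped up by decimal points, and it also copes with carriage returns. That will be for a different application, but I thought including it might be useful to anyone looking for a solution.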

DECLARE @text NVARCHAR(MAX) = 
'I would like to pay £22.99. Has many applications. 
£45.00 is good value. The price is low. The quality is good. 
Availability is widespread. Good value at £5.00.' --Set text.
DECLARE @text2 NVARCHAR(MAX) = LEFT(@text,CHARINDEX('. ',@text)) --Extract first sentence.


PRINT @text2 --Print first sentence.
SET @text = RIGHT(@text,LEN(@text)-LEN(@text2)) --Remove @text2 from @text; the remainder still starts with the space that followed the full stop.
WHILE LEN(@text) >0 
BEGIN 
SET @text = RIGHT(@text, LEN(@text)-1)--Take off the space that followed the full stop in the previous iteration of @text.
SET @text2 = IIF(LEN(LEFT(@text,CHARINDEX('. ',@text)))=0, LEFT(@text,CHARINDEX('.',@text)+LEN(RIGHT(@text, LEN(@text)-LEN(LEFT(@text,CHARINDEX('.',@text)))))),LEFT(@text,CHARINDEX('. ',@text)))--Extract the new first sentence.
PRINT @text2 
SET @text = RIGHT(@text,LEN(@text)-LEN(@text2)) --Remove @text2 from @text; the remainder still starts with the space that followed the full stop.
END
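The loop above prints one sentence per iteration. If a result set is preferred, the same idea (split on a full stop followed by a space, so that decimals survive) could be written as a recursive CTE. This is only a sketch of a different technique from the answer's loop; the names @delim, Sentences, StartPos, DelimPos and SentenceNo are made up for illustration.

```sql
-- Sketch: return each sentence as a row by walking the '. ' delimiters with a recursive CTE.
DECLARE @text  NVARCHAR(MAX) = N'I would like to pay £22.99. Has many applications. The price is low.';
DECLARE @delim NVARCHAR(2)   = N'. ';   -- full stop plus space, so '22.99' is not split

WITH Sentences AS (
    -- Anchor row: the first sentence starts at position 1.
    SELECT CAST(1 AS BIGINT)           AS StartPos,
           CHARINDEX(@delim, @text, 1) AS DelimPos,   -- 0 when no further delimiter exists
           1                           AS SentenceNo
    UNION ALL
    -- Recursive row: the next sentence starts just after the previous delimiter (2 characters long).
    SELECT DelimPos + 2,
           CHARINDEX(@delim, @text, DelimPos + 2),
           SentenceNo + 1
    FROM Sentences
    WHERE DelimPos > 0
)
SELECT SentenceNo,
       CASE WHEN DelimPos > 0
            THEN SUBSTRING(@text, StartPos, DelimPos - StartPos + 1)  -- keep the full stop
            ELSE SUBSTRING(@text, StartPos, LEN(@text))               -- last sentence: take the rest
       END AS Sentence
FROM Sentences
ORDER BY SentenceNo
OPTION (MAXRECURSION 0);   -- lift the default limit of 100 recursion levels
```

Unlike the loop, this sketch does not strip line breaks, so a sentence that follows a carriage return will begin with those characters unless they are removed first.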

Answer 2: (score: 0)

With the help of a parse function

Declare @TEXT VarChar(max) = 'Price if $15.25 is NOT split. Has many applications. The price is low. The quality is good. Availability is widespread.'

Select * from [dbo].[udf-Str-Parse](@Text,'. ')

Returns

Key_PS  Key_Value
1       Price if $15.25 is NOT split
2       Has many applications
3       The price is low
4       The quality is good
5       Availability is widespread.

UDF

CREATE FUNCTION [dbo].[udf-Str-Parse] (@String varchar(max),@delimeter varchar(10))
--Usage: Select * from [dbo].[udf-Str-Parse]('Dog,Cat,House,Car',',')
--       Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ')
--       Select * from [dbo].[udf-Str-Parse]('id26,id46|id658,id967','|')

Returns @ReturnTable Table (Key_PS int IDENTITY(1,1) NOT NULL , Key_Value varchar(max))

As

Begin
   Declare @intPos int,@SubStr varchar(max)
   Set @IntPos = CharIndex(@delimeter, @String)
   Set @String = Replace(@String,@delimeter+@delimeter,@delimeter)
   While @IntPos > 0
      Begin
         Set @SubStr = Substring(@String, 0, @IntPos)
         Insert into @ReturnTable (Key_Value) values (@SubStr)
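         -- Drop the leading item and its delimiter by position, so identical text later in the string is left intact.
         -- Len(@delimeter + 'x') - 1 gives the delimiter length including any trailing space, which Len() alone would ignore.
         Set @String = Substring(@String, @IntPos + Len(@delimeter + 'x') - 1, DataLength(@String))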
         Set @IntPos = CharIndex(@delimeter, @String)
      End
   Insert into @ReturnTable (Key_Value) values (@String)
   Return 
End