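I wrote the following query, which takes an id as input, gets the DocumentID from the Attachment table, and then uses that id to get the document name from the Document table. Once I have the file name, I strip every character that is not a-z, a digit, or one of "-", "_" and ".". The query below works fine when the entity id returns only one DocumentID. How can I make it work when one entity id returns multiple DocumentIDs? I also need to return all of the new names.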
ALTER PROCEDURE [dbo].[NormalizeDocumentFileName1]
-- Add the parameters for the stored procedure here
@id nvarchar(16),
@temp varchar(50) OUTPUT
AS
BEGIN
Select @temp=Document.TheName from Document where id = (Select DocumentId from Attachment where EntityId = @id)
Declare @KeepValues as varchar(50)
Set @KeepValues = '%[^a-z0-9-_.]%'
While PatIndex(@KeepValues, @temp) > 0
Set @temp = Stuff(@temp, PatIndex(@KeepValues, @temp), 1, '')
END
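Answer 0 (score: 2)
Personally, I would take a very different approach and use Alan Burstein's NGrams8K.
You want to avoid the WHILE loop, which tends to perform poorly, and take a set-based approach instead. I would use a function: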
CREATE FUNCTION NormalizeDocumentFileName (@FileName varchar(50) )
RETURNS TABLE
AS RETURN
WITH Tokens AS (
SELECT *
FROM dbo.NGrams8k (@FileName,1) --If you didn't create the function on the dbo schema, you'll need to change it.
WHERE token NOT LIKE '%[^a-z0-9-_.]%')
SELECT CONVERT(varchar(50),(SELECT Token + ''
FROM Tokens
ORDER BY Position
FOR XML PATH(''))) AS NormalFileName;
GO
Then you can do something as simple as:
SELECT D.YourColumn, NDFN.NormalFileName
FROM Document D
CROSS APPLY NormalizeDocumentFileName(D.TheName) NDFN;
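To tie this back to the question, here is a minimal sketch of how the function could be applied to every document attached to one entity. It assumes that Attachment.DocumentId references Document.id (as the question's query implies), that the function above ended up in the dbo schema, and that dbo.NGrams8k is already installed. The value assigned to @id is a made-up placeholder.

```sql
-- Sketch only: table and column names are taken from the question;
-- the @id value below is a hypothetical placeholder.
DECLARE @id nvarchar(16) = N'SomeEntityId';

SELECT D.id        AS DocumentId,
       D.TheName   AS OriginalName,
       NDFN.NormalFileName
FROM Attachment A
JOIN Document D
  ON D.id = A.DocumentId                                  -- one row per attached document
CROSS APPLY dbo.NormalizeDocumentFileName(D.TheName) NDFN -- cleaned name for each row
WHERE A.EntityId = @id;
```

Because the join returns one row per attachment, an entity with several documents simply yields several rows, so no scalar OUTPUT parameter is needed.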
Answer 1 (score: 0)
Hmm. You can drop the while loop and use a recursive CTE:
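-- @id and @KeepValues are the parameter and variable from the question's procedure
with cte as (
      -- anchor: every document attached to the entity (IN copes with several DocumentIds)
      select convert(varchar(8000), d.TheName) as TheName, 0 as lev,
             convert(varchar(8000), d.TheName) as orig_TheName
      from Document d
      where d.id in (select a.DocumentId from Attachment a where a.EntityId = @id)
      union all
      -- recursive step: strip one disallowed character per level;
      -- the explicit convert keeps the anchor and recursive column types identical
      select convert(varchar(8000), stuff(cte.TheName, patindex(@KeepValues, cte.TheName), 1, '')),
             cte.lev + 1, cte.orig_TheName
      from cte
      where patindex(@KeepValues, cte.TheName) > 0
     )
select x.TheName
from (select cte.TheName, cte.lev,
             max(cte.lev) over (partition by cte.orig_TheName) as max_lev
      from cte
     ) x
where x.lev = x.max_lev;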
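Answer 2 (score: 0)
Another set-based function for this type of thing is PatExclude8K, which does the same job as what Larnu put together and is reusable. You will have to get the T-SQL code from the link to create the function. Here it is in action: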
DECLARE @string varchar(50) = '$$$123___!!!555.ABC???';
SELECT * FROM dbo.patexclude8k(@string, '[^A-Za-z0-9-_.]');
Returns:
NewString
------------
123___555.ABC
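Note that what Larnu put together will return entity references for XML special characters such as "&", ">", etc., but it will perform better than PatExclude. If you do not expect to deal with special XML characters, you can use a slightly modified version that performs roughly the same. Here it is: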
CREATE FUNCTION dbo.PatExclude8K_NXP
(
@String VARCHAR(8000),
@Pattern VARCHAR(50)
)
/*******************************************************************************
Purpose:
Given a string (@String) and a pattern (@Pattern) of characters to remove,
remove the patterned characters from the string.
Usage:
--===== Basic Syntax Example
SELECT NewString
FROM dbo.PatExclude8K_NXP(@String,@Pattern);
--===== Remove all but Alpha characters
SELECT NewString
FROM dbo.SomeTable st
CROSS APPLY dbo.PatExclude8K_NXP(st.SomeString,'%[^A-Za-z]%');
--===== Remove all but Numeric digits
SELECT NewString
FROM dbo.SomeTable st
CROSS APPLY dbo.PatExclude8K_NXP(st.SomeString,'%[^0-9]%');
Programmer Notes:
1. @Pattern is case sensitive (the function can easily be modified to make it case insensitive)
2. There is no need to include the "%" before and/or after your pattern since we
are evaluating each character individually
Revision History:
Rev 00 - 20180508 Initial Development - Alan Burstein
*******************************************************************************/
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH
E1(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N)),
iTally(N) AS
(
SELECT TOP(CONVERT(INT,LEN(@String),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM E1 T1 CROSS JOIN E1 T2 CROSS JOIN E1 T3 CROSS JOIN E1 T4
)
SELECT NewString =
(
SELECT SUBSTRING(@String,N,1)
FROM iTally
WHERE 0 = PATINDEX(@Pattern,SUBSTRING(@String COLLATE Latin1_General_BIN,N,1))
FOR XML PATH('')
);
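The XML entity behaviour mentioned above can be seen with a short, self-contained sketch. It is not part of either function; the variable @s and the inline VALUES list are made up for illustration. The second query shows one commonly used workaround (FOR XML ... TYPE followed by .value()), which decodes the entities at some extra cost.

```sql
DECLARE @s varchar(50) = 'a&b>c';   -- hypothetical input containing XML special characters

-- Plain FOR XML PATH(''): special characters come back as entity references (&amp;, &gt;)
SELECT (SELECT SUBSTRING(@s, v.N, 1)
        FROM (VALUES (1),(2),(3),(4),(5)) AS v(N)
        ORDER BY v.N
        FOR XML PATH('')) AS Entitized;

-- Adding TYPE and reading the result back with .value() returns the characters unescaped
SELECT (SELECT SUBSTRING(@s, v.N, 1)
        FROM (VALUES (1),(2),(3),(4),(5)) AS v(N)
        ORDER BY v.N
        FOR XML PATH(''), TYPE).value('.', 'varchar(50)') AS Decoded;
```

With the patterns used in this thread, "&", "<" and ">" are removed before concatenation anyway, so the plain form is usually sufficient here.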
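Finally, both NGrams8K and PatExclude perform better when the optimizer chooses a parallel execution plan. To force a parallel plan you can use Adam Machanic's make_parallel. Taking Larnu's solution as an example, you could force a parallel plan like this: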
SELECT D.YourColumn, NDFN.NormalFileName
FROM Document D
CROSS APPLY NormalizeDocumentFileName(D.TheName) NDFN
CROSS APPLY dbo.make_parallel();