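Question:
I import text data into a database, and it contains many unwanted characters. From each imported string I need to keep only the substrings made of exactly 4 uppercase letters. For example: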
1447;#MIBD (This is a nice name);#2056;#LKRE (Very nice name indeed)
This could be the value of one column in one row of my table. What I need to extract from the string is:
MIBD and LKRE
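Ideally the result is the wanted strings separated by semicolons.
It has to be applied to the whole column, and I do not know how many of these 4-uppercase-letter strings may appear in a single row.
I have looked at various functions such as PATINDEX, but I do not really know how to approach this. Thanks for your help!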
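Answer 0: (score: 0)
Try this. It assumes that every four-character code is always preceded by ;#. Because PATINDEX is not case-sensitive under a case-insensitive collation (a common default), I added an extra check to verify that all four characters are uppercase.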
DECLARE @MyTable TABLE (ID INT, MyString VARCHAR(8000));

INSERT INTO @MyTable
VALUES
    (1, '1447;#MIBD (This is a nice name);#2056;#LKRE (Very nice name indeed)')
   ,(2, ';#DBCC (This is a nice name);#2056;#LLC (Very nice name indeed) ;#ABCD')
   ,(3, ';#AaaA;#OPQR;1234 (and) ;#WXYZ')
   ,(4, ';#abc this empty string without any code');

WITH CTE AS
(
    -- Anchor: take the first ';#' + 4 letters match in each row.
    -- STUFF removes everything up to and including the last character of that code (+5),
    -- so a code that is followed directly by another ';#' is not skipped.
    SELECT ID
          ,SUBSTRING(MyString, PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%', MyString) + 2, 4) AS NewString
          ,STUFF(MyString, 1, PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%', MyString) + 5, '') AS MyString
    FROM @MyTable m
    WHERE PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%', MyString) > 0
    UNION ALL
    -- Recursive step: repeat on the remainder of the string until no match is left.
    SELECT ID
          ,SUBSTRING(MyString, PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%', MyString) + 2, 4) AS NewString
          ,STUFF(MyString, 1, PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%', MyString) + 5, '') AS MyString
    FROM CTE c
    WHERE PATINDEX('%;#[A-Z][A-Z][A-Z][A-Z]%', MyString) > 0
)
SELECT c.ID,
       STUFF(( SELECT '; ' + NewString
               FROM CTE c1
               WHERE c1.ID = c.ID
                 AND ASCII(SUBSTRING(NewString, 1, 1)) BETWEEN ASCII('A') AND ASCII('Z') -- first char
                 AND ASCII(SUBSTRING(NewString, 2, 1)) BETWEEN ASCII('A') AND ASCII('Z') -- second char
                 AND ASCII(SUBSTRING(NewString, 3, 1)) BETWEEN ASCII('A') AND ASCII('Z') -- third char
                 AND ASCII(SUBSTRING(NewString, 4, 1)) BETWEEN ASCII('A') AND ASCII('Z') -- fourth char
               FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)') -- .value() keeps XML-special characters such as &, ", > and < from being entity-encoded
             ,1, 2, '') AS CodeList -- strip the leading '; ' separator (2 characters)
FROM CTE c
GROUP BY c.ID
OPTION (MAXRECURSION 0);
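As an alternative to the recursive CTE, here is a minimal sketch that assumes SQL Server 2017 or later (STRING_SPLIT needs compatibility level 130+, STRING_AGG needs 2017+). It splits each string on ';' and keeps only the pieces that start with '#' followed by four uppercase letters. The binary collation Latin1_General_BIN is my choice here to make [A-Z] match uppercase only; the table and sample rows are borrowed from the code above. This is not the method of the answer, just another way to get a similar result.

```sql
-- Sample data borrowed from the answer above (two rows are enough to illustrate)
DECLARE @MyTable TABLE (ID INT, MyString VARCHAR(8000));

INSERT INTO @MyTable
VALUES
    (1, '1447;#MIBD (This is a nice name);#2056;#LKRE (Very nice name indeed)')
   ,(3, ';#AaaA;#OPQR;1234 (and) ;#WXYZ');

SELECT m.ID,
       -- character 1 of each piece is '#', the code is characters 2 to 5
       STRING_AGG(SUBSTRING(s.value, 2, 4), ';') AS CodeList
FROM @MyTable AS m
CROSS APPLY STRING_SPLIT(m.MyString, ';') AS s
-- a binary collation makes the [A-Z] range match uppercase letters only
WHERE s.value COLLATE Latin1_General_BIN LIKE '#[A-Z][A-Z][A-Z][A-Z]%'
GROUP BY m.ID;
```

Two caveats with this sketch: neither STRING_SPLIT nor STRING_AGG (without WITHIN GROUP) guarantees the order of the codes, and unlike the PATINDEX version it also accepts a code at the very start of the string that has '#' but no leading ';'.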
Answer 1: (score: -1)
So far I have come up with something like this:
ALTER FUNCTION CleanData
(
    -- Parameters here
    @Text AS VARCHAR(4000)
)
RETURNS VARCHAR(4000)
AS
BEGIN
    WHILE PATINDEX('%[0-9#;()]%', @Text) > 0
    BEGIN
        SET @Text = STUFF(@Text, PATINDEX('%[0-9#;()]%', @Text), 1, '')
    END
    RETURN @Text
END
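But what I get back are the name initials and the characters inside the parentheses, because PATINDEX cannot tell uppercase from lowercase. Maybe it will help someone else anyway.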
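For what it is worth, the case behaviour of PATINDEX follows the collation of its input, so it can be changed per expression. The snippet below is only a small sketch of that idea; the variable value is made up for illustration, and the two collations are ones I picked (Latin1_General_CI_AS as a typical case-insensitive one, Latin1_General_BIN as a binary one).

```sql
-- Hypothetical test value: two lowercase letters, then ';#' and a 4-letter code
DECLARE @Text VARCHAR(100) = 'ab;#MIBD';

SELECT
    -- case-insensitive collation: [A-Z] also matches the lowercase 'a' at position 1
    PATINDEX('%[A-Z]%', @Text COLLATE Latin1_General_CI_AS) AS CaseInsensitivePos,
    -- binary collation: [A-Z] matches uppercase only, so the first hit is 'M' at position 5
    PATINDEX('%[A-Z]%', @Text COLLATE Latin1_General_BIN)   AS BinaryPos;
```

With a binary collation applied inside a pattern such as '%[^A-Z]%', a loop like the one in CleanData could strip lowercase letters as well, although it would still not separate the individual codes from each other.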