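I have been working with SQL tables and splitting up data. I have now reached the point of separating initials from surnames. The only problem is the spacing of the initials. For example (data in my table):
Hanse J S P
> J S P are the initials
Gerson B D V
> B D V are the initials
J D Timberland
> J D are the initials
So basically, there can be up to four initials, and they can appear at the beginning, middle, or end of the string. I have no idea how to go about extracting them. Put into separate columns, the result would be: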
COL A | COL B
J S P | Jansen
B D V | Gerson
J D | Timberland
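Can someone point me in the right direction? I am using SQL Server.
Answer 0 (score: 2)
Here is a way to do it by abusing the PARSENAME function. The big caveat is that PARSENAME is limited to 4 tokens, so J S P Jansen will work, while J S P C Jansen or John J S P Jansen will not.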
With parsedname AS
(
SELECT
PARSENAME(replace(name, ' ', '.'), 1) name1,
PARSENAME(replace(name, ' ', '.'), 2) name2,
PARSENAME(replace(name, ' ', '.'), 3) name3,
PARSENAME(replace(name, ' ', '.'), 4) name4
FROM yourtable
)
SELECT
CASE WHEN LEN(name4) = 1 THEN name4 ELSE '' END +
CASE WHEN LEN(name3) = 1 THEN name3 ELSE '' END +
CASE WHEN LEN(name2) = 1 THEN name2 ELSE '' END +
CASE WHEN LEN(name1) = 1 THEN name1 ELSE '' END as initials,
CASE WHEN LEN(name1) > 1 THEN name1
WHEN LEN(name2) > 1 THEN name2
WHEN LEN(name3) > 1 THEN name3
WHEN LEN(name4) > 1 THEN name4
END as surname
FROM parsedname
Here is a sqlfiddle of this in action
CREATE TABLE NAMES (name varchar(50));
INSERT INTO NAMES VALUES ('J S P Jansen');
INSERT INTO NAMES VALUES ('B D V Gerson');
INSERT INTO NAMES VALUES ('J D Timberland');
With parsedname AS
(
SELECT
PARSENAME(replace(name, ' ', '.'), 1) name1,
PARSENAME(replace(name, ' ', '.'), 2) name2,
PARSENAME(replace(name, ' ', '.'), 3) name3,
PARSENAME(replace(name, ' ', '.'), 4) name4
FROM names
)
SELECT
CASE WHEN LEN(name4) = 1 THEN name4 ELSE '' END +
CASE WHEN LEN(name3) = 1 THEN name3 ELSE '' END +
CASE WHEN LEN(name2) = 1 THEN name2 ELSE '' END +
CASE WHEN LEN(name1) = 1 THEN name1 ELSE '' END as initials,
CASE WHEN LEN(name1) > 1 THEN name1
WHEN LEN(name2) > 1 THEN name2
WHEN LEN(name3) > 1 THEN name3
WHEN LEN(name4) > 1 THEN name4
END as surname
FROM parsedname
+----------+------------+
| initials | surname |
+----------+------------+
| JSP | Jansen |
| BDV | Gerson |
| JD | Timberland |
+----------+------------+
If you need spaces between the letters, you can change that CASE expression to something like the following:
TRIM(CASE WHEN LEN(name4) = 1 THEN name4 + ' ' ELSE '' END +
CASE WHEN LEN(name3) = 1 THEN name3 + ' ' ELSE '' END +
CASE WHEN LEN(name2) = 1 THEN name2 + ' ' ELSE '' END +
CASE WHEN LEN(name1) = 1 THEN name1 + ' ' ELSE '' END) as initials
+----------+------------+
| initials | surname |
+----------+------------+
| J S P | Jansen |
| B D V | Gerson |
| J D | Timberland |
+----------+------------+
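To make the 4-token caveat concrete, here is a minimal sketch. It assumes a hypothetical table variable @demo with a name column, which is not part of the original answer. PARSENAME returns NULL once the dotted string has more than four parts, and a period already present in the data would be counted as an extra separator, so it can help to look for such rows before relying on this approach.

```sql
-- Hypothetical sample data; @demo and its contents are assumptions for illustration.
DECLARE @demo TABLE (name varchar(50));
INSERT INTO @demo VALUES ('J S P Jansen'), ('J S P C Jansen'), ('J. D Timberland');

SELECT name,
       -- Part 1 is the rightmost token; NULL when the string has more than 4 parts
       PARSENAME(REPLACE(name, ' ', '.'), 1) AS last_token,
       -- Number of spaces in the name (token count minus one for single-spaced data)
       LEN(name) - LEN(REPLACE(name, ' ', '')) AS space_count
FROM @demo
-- Keep only rows the PARSENAME trick is likely to mishandle:
-- more than 4 tokens, or a period already present in the name
WHERE LEN(name) - LEN(REPLACE(name, ' ', '')) > 3
   OR name LIKE '%.%';
```

Rows returned by this check would need one of the token-count-independent approaches from the other answers.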
Answer 1 (score: 1)
Similar to JNevil's answer (+1), but not limited to 4 tokens (the example below handles up to 9).
Example
Declare @YourTable table (SomeCol varchar(50))
Insert Into @YourTable values
('Hanse J S P')
,('Gerson B D V')
,('J D Timberland')
,('J D Timberland / J R R Tolkien')
Select A.SomeCol
,ColA = ltrim(
concat(IIF(len(Pos1)=1,' '+Pos1,null)
,IIF(len(Pos2)=1,' '+Pos2,null)
,IIF(len(Pos3)=1,' '+Pos3,null)
,IIF(len(Pos4)=1,' '+Pos4,null)
,IIF(len(Pos5)=1,' '+Pos5,null)
,IIF(len(Pos6)=1,' '+Pos6,null)
,IIF(len(Pos7)=1,' '+Pos7,null)
,IIF(len(Pos8)=1,' '+Pos8,null)
,IIF(len(Pos9)=1,' '+Pos9,null)
)
)
,ColB = ltrim(
concat(IIF(Pos1 not Like '[a-z]',' '+Pos1,null)
,IIF(Pos2 not Like '[a-z]',' '+Pos2,null)
,IIF(Pos3 not Like '[a-z]',' '+Pos3,null)
,IIF(Pos4 not Like '[a-z]',' '+Pos4,null)
,IIF(Pos5 not Like '[a-z]',' '+Pos5,null)
,IIF(Pos6 not Like '[a-z]',' '+Pos6,null)
,IIF(Pos7 not Like '[a-z]',' '+Pos7,null)
,IIF(Pos8 not Like '[a-z]',' '+Pos8,null)
,IIF(Pos9 not Like '[a-z]',' '+Pos9,null)
)
)
From @YourTable A
Cross Apply (
Select Pos1 = xDim.value('/x[1]','varchar(max)')
,Pos2 = xDim.value('/x[2]','varchar(max)')
,Pos3 = xDim.value('/x[3]','varchar(max)')
,Pos4 = xDim.value('/x[4]','varchar(max)')
,Pos5 = xDim.value('/x[5]','varchar(max)')
,Pos6 = xDim.value('/x[6]','varchar(max)')
,Pos7 = xDim.value('/x[7]','varchar(max)')
,Pos8 = xDim.value('/x[8]','varchar(max)')
,Pos9 = xDim.value('/x[9]','varchar(max)')
From (Select Cast('<x>' + replace(SomeCol,' ','</x><x>')+'</x>' as xml) as xDim) as A
) B
Returns
SomeCol                          ColA           ColB
Hanse J S P                      J S P          Hanse
Gerson B D V                     B D V          Gerson
J D Timberland                   J D            Timberland
J D Timberland / J R R Tolkien   J D / J R R    Timberland / Tolkien
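Answer 2 (score: 1)
I used a few built-in functions for this. The general idea is to use string_split to break the string into rows, use ROW_NUMBER to record an order based on token length and character position in the string, and then use FOR XML PATH() to concatenate the rows back into a single column.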
--Assume your data structure
DECLARE @temp TABLE (thestring varchar(1000))
INSERT INTO @temp VALUES
('Hanse J S P'), ('Gerson B D V'), ('J D Timberland')
;WITH CTE AS
(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY thestring ORDER BY thestring, LEN(value) ASC, pos ASC) [order]
FROM (
SELECT *
, value AS [theval]
, CHARINDEX(CASE WHEN len(value) = 1 THEN ' ' + value ELSE value END, thestring) AS [pos]
FROM @temp CROSS APPLY string_split(thestring, ' ')
) AS dT
)
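SELECT ( SELECT value + ' ' AS [text()]
FROM cte
WHERE cte.thestring = T.thestring
AND LEN(theval) = 1
ORDER BY [order] -- apply the order computed by ROW_NUMBER; without this the token order is not guaranteed
FOR XML PATH('')
) AS [COL A]
,( SELECT value + ' ' AS [text()]
FROM cte
WHERE cte.thestring = T.thestring
AND LEN(theval) > 1
ORDER BY [order] -- same ordering for the multi-character tokens
FOR XML PATH('')
) AS [COL B]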
FROM @temp T
GROUP BY thestring
Produces the output:
COL A   COL B
-----   -----
B D V   Gerson
J S P   Hanse
J D     Timberland
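Answer 3 (score: 1)
This one uses CHARINDEX and a recursive CTE to extract the space-delimited substrings from the name.
Once you have the substrings, simply glue them back together.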
WITH yourdata(FullName) AS (
SELECT 'Hanse J S P' UNION
SELECT 'Gerson B D V' UNION
SELECT 'J D Timberland' UNION
SELECT 'TEST 1 TEST 2 TEST 3'
), cte AS (
SELECT
FullName,
CASE WHEN Pos1 = 0 THEN FullName ELSE SUBSTRING(FullName, 1, Pos1 - 1) END AS LeftPart,
CASE WHEN Pos1 = 0 THEN Null ELSE SUBSTRING(FullName, Pos1 + 1, Pos2 - Pos1) END AS NextPart,
1 AS PartSort
FROM yourdata
CROSS APPLY (SELECT CHARINDEX(' ', FullName) AS Pos1, LEN(FullName) AS Pos2) AS CA
UNION ALL
SELECT
FullName,
CASE WHEN Pos1 = 0 THEN NextPart ELSE SUBSTRING(NextPart, 1, Pos1 - 1) END,
CASE WHEN Pos1 = 0 THEN Null ELSE SUBSTRING(NextPart, Pos1 + 1, Pos2 - Pos1) END,
PartSort + 1
FROM cte
CROSS APPLY (SELECT CHARINDEX(' ', NextPart) AS Pos1, LEN(NextPart) AS Pos2) AS CA
WHERE NextPart IS NOT NULL
)
SELECT yourdata.FullName, STUFF(CA1.XMLStr, 1, 1, '') AS Initials, STUFF(CA2.XMLStr, 1, 1, '') AS Names
FROM yourdata
CROSS APPLY (
SELECT CONCAT(' ', LeftPart)
FROM cte
WHERE FullName = yourdata.FullName AND LEN(LeftPart) = 1
ORDER BY PartSort
FOR XML PATH('')
) AS CA1(XMLStr)
CROSS APPLY (
SELECT CONCAT(' ', LeftPart)
FROM cte
WHERE FullName = yourdata.FullName AND LEN(LeftPart) > 1
ORDER BY PartSort
FOR XML PATH('')
) AS CA2(XMLStr)
Result:
| FullName             | Initials | Names          |
|----------------------|----------|----------------|
| Gerson B D V         | B D V    | Gerson         |
| Hanse J S P          | J S P    | Hanse          |
| J D Timberland       | J D      | Timberland     |
| TEST 1 TEST 2 TEST 3 | 1 2 3    | TEST TEST TEST |
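Answer 4 (score: 0)
Which version of SQL Server do you have? Is STRING_SPLIT() available?
If so, split on the space as the delimiter, iterate over the resulting strings, check the length of each, and append it to the result string when it is exactly one character long and is a letter.
Prepend a space unless the result string is still empty.
If STRING_SPLIT() is not available... well... here are some solutions:
T-SQL split string based on delimiter
- Addendum
For the second part of the question (which was not there when I first posted my reply), where you want to isolate the non-initial part into a second column, I would basically split the logic into two blocks that build two result strings based on the length of each element.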
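The steps described in this answer can be sketched roughly as follows. This is only an illustration under stated assumptions: the table variable @names and the column names FullName, Initials and Rest are hypothetical, STRING_AGG requires SQL Server 2017 or later, and the third argument of STRING_SPLIT (enable_ordinal, which exposes the ordinal column) requires SQL Server 2022 or Azure SQL Database. On older versions STRING_SPLIT does not guarantee the order of its output.

```sql
-- Hypothetical sample data taken from the question's examples.
DECLARE @names TABLE (FullName varchar(50));
INSERT INTO @names VALUES ('Hanse J S P'), ('Gerson B D V'), ('J D Timberland');

SELECT n.FullName, x.Initials, x.Rest
FROM @names AS n
CROSS APPLY (
    SELECT
        -- Single letters become the initials; STRING_AGG skips NULLs,
        -- so tokens that are not a single letter add no separator here.
        STRING_AGG(CASE WHEN s.value LIKE '[A-Za-z]' THEN s.value END, ' ')
            WITHIN GROUP (ORDER BY s.ordinal) AS Initials,
        -- Tokens longer than one character form the second column.
        STRING_AGG(CASE WHEN LEN(s.value) > 1 THEN s.value END, ' ')
            WITHIN GROUP (ORDER BY s.ordinal) AS Rest
    FROM STRING_SPLIT(n.FullName, ' ', 1) AS s  -- 1 = enable_ordinal
) AS x;
```

Aggregating inside the CROSS APPLY rather than with an outer GROUP BY keeps duplicate FullName values as separate rows instead of merging their tokens.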
Note: this will not be elegant in SQL Server versions before 2016, and may even require a CURSOR (sigh).
I know I will get downvoted for mentioning a cursor.