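Retrieve initials from a SQL Server table

Time: 2018-09-10 17:01:14

Tags: sql sql-server

I have been working with a SQL table and splitting up its data. I have gotten as far as needing to separate the initials from the surname. The only problem is that the initials are separated by spaces. For example (data from my table):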

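  • Hanse J S P > J S P are the initials
  • Gerson B D V > B D V are the initials
  • J D Timberland > J D are the initials

So basically there can be up to four initials, and they can appear at the start, in the middle, or at the end of the string. I have no idea how to extract them and put them in a separate column. The result would be: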

COL A | COL B
J S P | Jansen
B D V | Gerson
J D   | Timberland

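Can anyone point me in the right direction? I am using SQL Server.

5 Answers:

Answer 0: (score: 2)

Here is a way to do it by abusing the PARSENAME function. The big caveat is that PARSENAME is limited to 4 tokens, so J S P Jansen works, while J S P C Jansen or John J S P Jansen does not.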

With parsedname AS
(
  SELECT
      PARSENAME(replace(name, ' ', '.'), 1) name1,
      PARSENAME(replace(name, ' ', '.'), 2) name2,
      PARSENAME(replace(name, ' ', '.'), 3) name3,
      PARSENAME(replace(name, ' ', '.'), 4) name4
   FROM yourtable
)
SELECT 
   CASE WHEN LEN(name4) = 1 THEN name4 ELSE '' END +
      CASE WHEN LEN(name3) = 1 THEN name3 ELSE '' END +
      CASE WHEN LEN(name2) = 1 THEN name2 ELSE '' END +
      CASE WHEN LEN(name1) = 1 THEN name1 ELSE '' END as initials,
   CASE WHEN LEN(name1) > 1 THEN name1 
      WHEN LEN(name2) > 1 THEN name2
      WHEN LEN(name3) > 1 THEN name3
      WHEN LEN(name4) > 1 THEN name4
      END as surname
FROM parsedname

Here is a sqlfiddle of this in action

CREATE TABLE NAMES (name varchar(50));
INSERT INTO NAMES VALUES ('J S P Jansen');
INSERT INTO NAMES VALUES ('B D V Gerson');
INSERT INTO NAMES VALUES ('J D Timberland');

With parsedname AS
(
  SELECT
      PARSENAME(replace(name, ' ', '.'), 1) name1,
      PARSENAME(replace(name, ' ', '.'), 2) name2,
      PARSENAME(replace(name, ' ', '.'), 3) name3,
      PARSENAME(replace(name, ' ', '.'), 4) name4
   FROM names
)
SELECT 
   CASE WHEN LEN(name4) = 1 THEN name4 ELSE '' END +
      CASE WHEN LEN(name3) = 1 THEN name3 ELSE '' END +
      CASE WHEN LEN(name2) = 1 THEN name2 ELSE '' END +
      CASE WHEN LEN(name1) = 1 THEN name1 ELSE '' END as initials,
   CASE WHEN LEN(name1) > 1 THEN name1 
      WHEN LEN(name2) > 1 THEN name2
      WHEN LEN(name3) > 1 THEN name3
      WHEN LEN(name4) > 1 THEN name4
      END as surname
FROM parsedname

+----------+------------+
| initials |  surname   |
+----------+------------+
| JSP      | Jansen     |
| BDV      | Gerson     |
| JD       | Timberland |
+----------+------------+

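If you need spaces between the letters, you can change that CASE expression to something like the following (TRIM requires SQL Server 2017 or later; on older versions use LTRIM(RTRIM(...)) instead):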

TRIM(CASE WHEN LEN(name4) = 1 THEN name4 + ' ' ELSE '' END +
      CASE WHEN LEN(name3) = 1 THEN name3 + ' ' ELSE '' END +
      CASE WHEN LEN(name2) = 1 THEN name2 + ' ' ELSE '' END +
      CASE WHEN LEN(name1) = 1 THEN name1 + ' ' ELSE '' END) as initials

SQLFiddle with the spaces

+----------+------------+
| initials |  surname   |
+----------+------------+
| J S P    | Jansen     |
| B D V    | Gerson     |
| J D      | Timberland |
+----------+------------+
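To see the four-token limit in practice, here is a minimal check; the literal names are sample values only. PARSENAME returns NULL as soon as the dotted string has more than four parts, so for such rows the query above would produce an empty initials string and a NULL surname rather than raise an error.

```sql
-- Sample values only; PARSENAME accepts at most four dot-separated parts.
SELECT
    PARSENAME(REPLACE('J S P Jansen',   ' ', '.'), 1) AS four_tokens,  -- last token of a 4-part string
    PARSENAME(REPLACE('J S P C Jansen', ' ', '.'), 1) AS five_tokens;  -- NULL: more than four parts
```

A quick way to find affected rows is to filter on the first PARSENAME call being NULL for a non-empty name.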

Answer 1: (score: 1)

Similar to JNevil's answer (+1), but not limited to 4 tokens (as written, it handles up to 9).

Example

Declare @YourTable table (SomeCol varchar(50))
Insert Into @YourTable values
 ('Hanse J S P')
,('Gerson B D V')
,('J D Timberland')
,('J D Timberland / J R R Tolkien')


Select A.SomeCol
      ,ColA = ltrim(
              concat(IIF(len(Pos1)=1,' '+Pos1,null)
                    ,IIF(len(Pos2)=1,' '+Pos2,null)
                    ,IIF(len(Pos3)=1,' '+Pos3,null)
                    ,IIF(len(Pos4)=1,' '+Pos4,null)
                    ,IIF(len(Pos5)=1,' '+Pos5,null)
                    ,IIF(len(Pos6)=1,' '+Pos6,null)
                    ,IIF(len(Pos7)=1,' '+Pos7,null)
                    ,IIF(len(Pos8)=1,' '+Pos8,null)
                    ,IIF(len(Pos9)=1,' '+Pos9,null)
                    )
              )
      ,ColB = ltrim(
              concat(IIF(Pos1 not Like '[a-z]',' '+Pos1,null)
                    ,IIF(Pos2 not Like '[a-z]',' '+Pos2,null)
                    ,IIF(Pos3 not Like '[a-z]',' '+Pos3,null)
                    ,IIF(Pos4 not Like '[a-z]',' '+Pos4,null)
                    ,IIF(Pos5 not Like '[a-z]',' '+Pos5,null)
                    ,IIF(Pos6 not Like '[a-z]',' '+Pos6,null)
                    ,IIF(Pos7 not Like '[a-z]',' '+Pos7,null)
                    ,IIF(Pos8 not Like '[a-z]',' '+Pos8,null)
                    ,IIF(Pos9 not Like '[a-z]',' '+Pos9,null)
                    )
              )
 From  @YourTable A
 Cross Apply (
                Select Pos1 = xDim.value('/x[1]','varchar(max)')
                      ,Pos2 = xDim.value('/x[2]','varchar(max)')
                      ,Pos3 = xDim.value('/x[3]','varchar(max)')
                      ,Pos4 = xDim.value('/x[4]','varchar(max)')
                      ,Pos5 = xDim.value('/x[5]','varchar(max)')
                      ,Pos6 = xDim.value('/x[6]','varchar(max)')
                      ,Pos7 = xDim.value('/x[7]','varchar(max)')
                      ,Pos8 = xDim.value('/x[8]','varchar(max)')
                      ,Pos9 = xDim.value('/x[9]','varchar(max)')
                From  (Select Cast('<x>' + replace(SomeCol,' ','</x><x>')+'</x>' as xml) as xDim) as A 
             ) B

Returns

SomeCol                           ColA            ColB
Hanse J S P                       J S P           Hanse
Gerson B D V                      B D V           Gerson
J D Timberland                    J D             Timberland
J D Timberland / J R R Tolkien    J D / J R R     Timberland / Tolkien

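Answer 2: (score: 1)

I used a few built-in functions for this. The general idea is to split the string into rows with STRING_SPLIT, use ROW_NUMBER to record an order based on token length and position in the string, and then use FOR XML PATH() to concatenate the rows back into a single column.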

--Assume your data structure
DECLARE @temp TABLE (thestring varchar(1000))
INSERT INTO @temp VALUES
 ('Hanse J S P'), ('Gerson B D V'), ('J D Timberland')

;WITH CTE AS
(
    SELECT *
        ,ROW_NUMBER() OVER (PARTITION BY thestring ORDER BY thestring, LEN(value) ASC, pos ASC) [order]
        FROM (
                SELECT *      
                    , value AS [theval]
                    , CHARINDEX(CASE WHEN len(value) = 1 THEN ' ' + value ELSE value END, thestring) AS [pos]
                FROM @temp CROSS APPLY string_split(thestring, ' ')
            )  AS dT
)
SELECT ( SELECT value + ' ' AS [text()]
                 FROM cte 
                WHERE cte.thestring = T.thestring
                  AND LEN(theval) = 1
                ORDER BY [order]   -- apply the ROW_NUMBER order so the result is deterministic
                FOR XML PATH('')
       ) AS [COL A]
      ,( SELECT value + ' ' AS [text()]
                 FROM cte 
                WHERE cte.thestring = T.thestring
                  AND LEN(theval) > 1
                ORDER BY [order]
                FOR XML PATH('')
       ) AS [COL B]
  FROM @temp T 
GROUP BY thestring

Which produces the output:

COL A   COL B
-----   -----
B D V   Gerson 
J S P   Hanse 
J D     Timberland 

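Answer 3: (score: 1)

This one uses CHARINDEX and a recursive CTE to extract the space-delimited substrings from the name:

  • Find the substring before the first space
  • Feed the remaining substring back into the same CTE

Once you have the substrings, simply glue them back together.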

WITH yourdata(FullName) AS (
    SELECT 'Hanse J S P' UNION
    SELECT 'Gerson B D V' UNION
    SELECT 'J D Timberland' UNION
    SELECT 'TEST 1 TEST 2 TEST 3'
), cte AS (
    SELECT
        FullName,
        CASE WHEN Pos1 = 0 THEN FullName ELSE SUBSTRING(FullName, 1, Pos1 - 1) END AS LeftPart,
        CASE WHEN Pos1 = 0 THEN Null     ELSE SUBSTRING(FullName, Pos1 + 1, Pos2 - Pos1) END AS NextPart,
        1 AS PartSort
    FROM yourdata
    CROSS APPLY (SELECT CHARINDEX(' ', FullName) AS Pos1, LEN(FullName) AS Pos2) AS CA
    UNION ALL
    SELECT
        FullName,
        CASE WHEN Pos1 = 0 THEN NextPart ELSE SUBSTRING(NextPart, 1, Pos1 - 1) END,
        CASE WHEN Pos1 = 0 THEN Null     ELSE SUBSTRING(NextPart, Pos1 + 1, Pos2 - Pos1) END,
        PartSort + 1
    FROM cte
    CROSS APPLY (SELECT CHARINDEX(' ', NextPart) AS Pos1, LEN(NextPart) AS Pos2) AS CA
    WHERE NextPart IS NOT NULL
)
SELECT yourdata.FullName, STUFF(CA1.XMLStr, 1, 1, '') AS Initials, STUFF(CA2.XMLStr, 1, 1, '') AS Names
FROM yourdata
CROSS APPLY (
    SELECT CONCAT(' ', LeftPart)
    FROM cte
    WHERE FullName = yourdata.FullName AND LEN(LeftPart) = 1
    ORDER BY PartSort
    FOR XML PATH('')
) AS CA1(XMLStr)
CROSS APPLY (
    SELECT CONCAT(' ', LeftPart)
    FROM cte
    WHERE FullName = yourdata.FullName AND LEN(LeftPart) > 1
    ORDER BY PartSort
    FOR XML PATH('')
) AS CA2(XMLStr)

Result:

| FullName             | Initials | Names          |
|----------------------|----------|----------------|
| Gerson B D V         | B D V    | Gerson         |
| Hanse J S P          | J S P    | Hanse          |
| J D Timberland       | J D      | Timberland     |
| TEST 1 TEST 2 TEST 3 | 1 2 3    | TEST TEST TEST |

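Answer 4: (score: 0)

Which version of SQL Server do you have? Is STRING_SPLIT() available?

If so, split on the space delimiter, iterate over the resulting strings, check the length of each one, and append it to the result string when it is exactly one character long and is a letter.

Unless the result string is still empty, prepend a space before appending.

If STRING_SPLIT() is not available... well... here are some solutions:

T-SQL split string based on delimiter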

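Addendum

For the second part of the question (which was not there when I first posted my reply), where you want the non-initial parts isolated in a second column, I would basically split the logic into two blocks that build two result strings based on the length of each element.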
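The split-and-test-length idea described above could be sketched set-based as follows. This is only an illustration under stated assumptions: it relies on the third argument of STRING_SPLIT (enable_ordinal) and on STRING_AGG, so it needs SQL Server 2022 or Azure SQL Database, and the table variable @names with its column FullName is a hypothetical stand-in for the real table.

```sql
-- Sketch only: assumes SQL Server 2022+ or Azure SQL Database
-- (STRING_SPLIT with enable_ordinal, plus STRING_AGG).
DECLARE @names TABLE (FullName varchar(50));   -- hypothetical sample table
INSERT INTO @names VALUES ('Hanse J S P'), ('Gerson B D V'), ('J D Timberland');

SELECT n.FullName,
       -- tokens that are exactly one letter, joined in their original order;
       -- STRING_AGG skips the NULLs produced for all other tokens
       STRING_AGG(CASE WHEN s.value LIKE '[A-Za-z]' THEN s.value END, ' ')
           WITHIN GROUP (ORDER BY s.ordinal) AS Initials,
       -- tokens longer than one character, joined in their original order
       STRING_AGG(CASE WHEN LEN(s.value) > 1 THEN s.value END, ' ')
           WITHIN GROUP (ORDER BY s.ordinal) AS Surname
FROM @names AS n
CROSS APPLY STRING_SPLIT(n.FullName, ' ', 1) AS s   -- 1 = also return the ordinal column
GROUP BY n.FullName;
```

Grouping by FullName collapses duplicate names into one row; if the real table has a key column, grouping by that key instead would keep every row.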

Note: this won't be elegant in SQL Server versions before 2016, and may even require a CURSOR (sigh).

I know I'll catch flak for mentioning a cursor.