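Is it possible to group strings within a string in Teradata?

Time: 2013-11-22 22:21:24

Tags: sql teradata

Original table (this is exactly the table I am working with, including all the commas, parentheses, etc.):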

id     attributes
1      123(red), 139(red), 123(white), 123(black), 139(white),
2      123(black), 139(white), 123(green),
32     223(blue), 223(red), 553(white), 123(black),
4      323(white), 139(red), 
23     523(red),

I need to group by the attribute number so that my table looks like this:

id     attributes
1      123(red, white, black); 139(red, white);
2      123(black, green); 139(white);
32     223(blue, red); 553(white); 123(black);
4      323(white); 139(red);
23     523(red);

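How can I do this?

Unfortunately, I don't have access to stored procedures or to functions such as oreplace, translate, etc. I have worked with Oracle before, where this would be a simple task because stored procedures are available. Here I have no idea how to do it.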

1 answer:

Answer 0: (score: 10)

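SQL is definitely not the right language for string processing like this :-)

I reused existing code for splitting and building comma-separated strings, but in TD14 it would be easier (there are strtok_split_to_table and udfConcat).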

CREATE VOLATILE TABLE vt (id INT, attrib VARCHAR(100)) ON COMMIT PRESERVE ROWS;

INSERT INTO vt(1      ,'123(red), 139(red), 123(white), 123(black), 139(white),');
INSERT INTO vt(2      ,'123(black), 139(white), 123(green),');
INSERT INTO vt(32     ,'223(blue), 223(red), 553(white), 123(black),');
INSERT INTO vt(4      ,'323(white), 139(red), ');
INSERT INTO vt(23     ,'523(red),');

-- Peel one comma-separated token (word) off attrib per recursion step; pos numbers the tokens
WITH RECURSIVE cte
 (id,
  len,
  remaining,
  word,
  pos
 ) AS (
  SELECT
    id,
    POSITION(',' IN attrib || ',') - 1 AS len,
    SUBSTRING(attrib || ',' FROM len + 2) AS remaining,
    TRIM(SUBSTRING(attrib FROM 1 FOR len)) AS word,
    1
  FROM vt
  UNION ALL
  SELECT
    id,
    POSITION(',' IN remaining)- 1 AS len_new,
    SUBSTRING(remaining FROM len_new + 2),
    TRIM(SUBSTRING(remaining FROM 1 FOR len_new)),
    pos + 1
  FROM cte
  WHERE remaining <> ''
 )
SELECT
  id,
     MAX(CASE WHEN newpos = 1 THEN newgrp ELSE '' END) ||
     MAX(CASE WHEN newpos = 2 THEN newgrp ELSE '' END) ||
     MAX(CASE WHEN newpos = 3 THEN newgrp ELSE '' END) ||
     MAX(CASE WHEN newpos = 4 THEN newgrp ELSE '' END) ||
     MAX(CASE WHEN newpos = 5 THEN newgrp ELSE '' END) ||
     MAX(CASE WHEN newpos = 6 THEN newgrp ELSE '' END)
     -- add as many CASEs as needed
FROM
 ( 
   -- Build one group string per (id, a), e.g. 123(red, white, black); 
   SELECT 
     id, 
     ROW_NUMBER() 
     OVER (PARTITION BY id
           ORDER BY newgrp) AS newpos,
     a ||
     MAX(CASE WHEN pos = 1 THEN '('  || b ELSE '' END) ||
     MAX(CASE WHEN pos = 2 THEN ', ' || b ELSE '' END) ||
     MAX(CASE WHEN pos = 3 THEN ', ' || b ELSE '' END) ||
     MAX(CASE WHEN pos = 4 THEN ', ' || b ELSE '' END) ||
     MAX(CASE WHEN pos = 5 THEN ', ' || b ELSE '' END) ||
     MAX(CASE WHEN pos = 6 THEN ', ' || b ELSE '' END)
     -- add as many CASEs as needed
     || '); ' AS newgrp
   FROM 
    (
      -- Split each token such as 123(red) into the number (a) and the color (b)
      SELECT
        id,
        ROW_NUMBER() 
        OVER (PARTITION BY id, a
              ORDER BY pos) AS pos,
        SUBSTRING(word FROM 1 FOR POSITION('(' IN word) - 1) AS a,
        TRIM(TRAILING ')' FROM SUBSTRING(word FROM POSITION('(' IN word) + 1)) AS b
      FROM cte
      WHERE word <> ''
    ) AS dt
   GROUP BY id, a
 ) AS dt
GROUP BY id;
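
To check what the query should return, here is a small reference sketch of the same split, group and rebuild steps outside the database. It is not part of the answer's method and is written in Python rather than Teradata SQL; the function name group_attributes is made up for illustration, and the sketch assumes every token has the form number(color) as in the sample data.

```python
def group_attributes(attrib: str) -> str:
    """Regroup '123(red), 139(red), 123(white),' as '123(red, white); 139(red);'."""
    groups = {}  # number -> list of colors; dicts keep first-appearance order
    for word in attrib.split(","):
        word = word.strip()
        if not word:
            continue  # skip the empty piece left by the trailing comma
        # Same split as the innermost SELECT: text before '(' is a, the rest is b
        number, _, rest = word.partition("(")
        groups.setdefault(number, []).append(rest.rstrip(")"))
    return " ".join(
        f"{number}({', '.join(colors)});" for number, colors in groups.items()
    )


# Sample rows copied from the question
rows = [
    (1, "123(red), 139(red), 123(white), 123(black), 139(white),"),
    (2, "123(black), 139(white), 123(green),"),
    (32, "223(blue), 223(red), 553(white), 123(black),"),
    (4, "323(white), 139(red), "),
    (23, "523(red),"),
]

for row_id, attrib in rows:
    print(row_id, group_attributes(attrib))
```

This sketch keeps the groups in order of first appearance, which matches the expected table in the question. The query above numbers the groups with ORDER BY newgrp, that is, by the group string itself, so for id 32 it would presumably list 123(black) before 223(blue, red).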