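How to create a new row for each occurrence of a specific value

Time: 2016-08-23 12:11:02

Tags: sql teradata

I need some guidance on solving this problem. I have a data set similar to the following: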

Record_type         Record_Text
H01                 ABCDEFGHI123456789
D45                 BCDEFGH098765
D20                 BABRTHYUHU56789
D30                 QWERTY09876558471255
D12                 ASDFGHJ9814752
H02                 UGHRYCGDF12304025
G80                 YHNBGTRFV0147852
H01                 MLOPKIJUHNB624817
D20                 PLKIJUNHMY7653235
H15                 MVNBDGETDGSTEX9874
D30                 GNHGDTBFJVNV834687
H02                 JDGHKDGHSDFIG7845387
D60                 GHCNDBDGCTEF45367

Each occurrence of H01 starts a new transaction. In the example above there are two transactions (H01 through G80 and H01 through D60).
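
As a rough illustration of the rule just described (not part of the original post), the Python sketch below walks the sample rows in their listed order and opens a new transaction at every H01. The `split_transactions` helper is a hypothetical name, and the sketch assumes the listed order is the real row order, which a SQL table does not guarantee without an ordering column.

```python
# Sample rows from the question, in their listed order (assumed to be the true order).
ROWS = [
    ("H01", "ABCDEFGHI123456789"),
    ("D45", "BCDEFGH098765"),
    ("D20", "BABRTHYUHU56789"),
    ("D30", "QWERTY09876558471255"),
    ("D12", "ASDFGHJ9814752"),
    ("H02", "UGHRYCGDF12304025"),
    ("G80", "YHNBGTRFV0147852"),
    ("H01", "MLOPKIJUHNB624817"),
    ("D20", "PLKIJUNHMY7653235"),
    ("H15", "MVNBDGETDGSTEX9874"),
    ("D30", "GNHGDTBFJVNV834687"),
    ("H02", "JDGHKDGHSDFIG7845387"),
    ("D60", "GHCNDBDGCTEF45367"),
]


def split_transactions(rows):
    """Start a new transaction at every H01 row; rows before the first H01 are dropped."""
    transactions = []
    for record_type, record_text in rows:
        if record_type == "H01":
            transactions.append([])
        if transactions:
            transactions[-1].append((record_type, record_text))
    return transactions


transactions = split_transactions(ROWS)
# First and last record type of each transaction.
boundaries = [(t[0][0], t[-1][0]) for t in transactions]
print(boundaries)  # [('H01', 'G80'), ('H01', 'D60')]
```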

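I need to select certain characters from the RECORD_TEXT field for each transaction, based on certain conditions. I tried the first transaction on its own using the code below:
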
SELECT
 ( MAX(CASE WHEN RECORD_TYPE = 'H01' THEN (SUBSTR(RECORD_TEXT,1,10))  END)
|| MAX(CASE WHEN  RECORD_TYPE = 'D20' THEN ',' || (SUBSTR(RECORD_TEXT,2,3))     END)
|| MAX(CASE WHEN  RECORD_TYPE = 'D30' THEN ',' ||  (SUBSTR(RECORD_TEXT,9,8)) END)
|| MAX(CASE WHEN  RECORD_TYPE = 'H02' THEN ',' ||    (SUBSTR(RECORD_TEXT,13,4)) END)) AS TOTAL_FIELD
FROM   TABLE  

I got the expected output:

ABCDEFGHI1,ABR,87655847,0402

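But I can't work out how to handle the subsequent transactions.

The expected output for the example above (two transactions = two rows) would be: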

ABCDEFGHI1,ABR,87655847,0402
MLOPKIJUHN,LKI,JVNV8346,G784

There are about 200 transactions in total. I am using Teradata version 14. Please help.

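3 Answers:

Answer 0: (score: 0)

This is a job for analytic functions. I'm not familiar with Teradata, but it should be similar to what other databases offer.

Look up partitioning; see this link:

http://www.tutorialspoint.com/teradata/teradata_partitioned_primary_index.htm

You can basically slice the data however you like, so you would do something like this: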

PARTITION BY Record_type

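Unless you add a column as Tyron78 suggested, you will also have to use other analytic functions to build an algorithm that determines which records fall between the ones belonging to this set.

Hope this helps.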
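
To make the last point concrete, here is a small Python sketch (an illustration only, not Teradata code). It assumes a hypothetical position number for each sample row and groups the rows by Record_type alone, the way a plain PARTITION BY Record_type would. Each group then mixes rows from both transactions, which is why extra logic or an ordering column is needed.

```python
from collections import defaultdict

# Record types from the question, in listed order; positions 1..13 are an assumed ordering.
RECORD_TYPES = [
    "H01", "D45", "D20", "D30", "D12", "H02", "G80",
    "H01", "D20", "H15", "D30", "H02", "D60",
]

# Group row positions by record type only, ignoring transaction boundaries.
partitions = defaultdict(list)
for position, record_type in enumerate(RECORD_TYPES, start=1):
    partitions[record_type].append(position)

for record_type in ("H01", "D20", "D30", "H02"):
    print(record_type, partitions[record_type])
```

The H01 group holds positions 1 and 8, one row from each transaction, so the grouping by itself says nothing about which D20 or D30 row belongs to which H01.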

Answer 1: (score: 0)

-- Sample data in a T-SQL table variable; CREATE_TMSP is an added ordering column
DECLARE @t table(
  CREATE_TMSP int, Record_type nvarchar(20), Record_Text nvarchar(50)
);

INSERT INTO @t VALUES(1,'H01','ABCDEFGHI123456789');
INSERT INTO @t VALUES(2,'D45','BCDEFGH098765');
INSERT INTO @t VALUES(3,'D20','BABRTHYUHU56789');
INSERT INTO @t VALUES(4,'D30','QWERTY09876558471255');
INSERT INTO @t VALUES(5,'D12','ASDFGHJ9814752');
INSERT INTO @t VALUES(6,'H02','UGHRYCGDF12304025');
INSERT INTO @t VALUES(7,'G80','YHNBGTRFV0147852');
INSERT INTO @t VALUES(8,'H01','MLOPKIJUHNB624817');
INSERT INTO @t VALUES(9,'D20','PLKIJUNHMY7653235');
INSERT INTO @t VALUES(10,'H15','MVNBDGETDGSTEX9874');
INSERT INTO @t VALUES(11,'D30','GNHGDTBFJVNV834687');
INSERT INTO @t VALUES(12,'H02','JDGHKDGHSDFIG7845387');
INSERT INTO @t VALUES(13,'D60','GHCNDBDGCTEF45367');

-- Number the rows in CREATE_TMSP order
WITH cte AS(
  SELECT RECORD_TYPE, RECORD_TEXT, DENSE_RANK() OVER(ORDER BY CREATE_TMSP) AS DERIVED_COLUMN
    FROM @t
),
-- For each H01 row, find where the next H01 row (the next transaction) starts
cteLead AS(
  SELECT Record_Type, Record_Text, DERIVED_COLUMN AS DERIVED_COLUMN_LEFT, ISNULL(LEAD(DERIVED_COLUMN) OVER (ORDER BY DERIVED_COLUMN), 999999) AS DERIVED_COLUMN_RIGHT
   FROM cte
   WHERE Record_type = 'H01'
),
-- Attach every relevant row to its transaction and cut out the required characters
cteSplit AS(
SELECT a.DERIVED_COLUMN_LEFT AS ID, a.Record_Type AS RecordTypeHead, a.Record_Text AS RecordTextHead, a.DERIVED_COLUMN_LEFT, a.DERIVED_COLUMN_RIGHT,
       b.Record_Type,
       CASE
         WHEN b.Record_type = 'H01' THEN SUBSTRING(b.RECORD_TEXT,1,10)
         WHEN b.Record_type = 'D20' THEN SUBSTRING(b.RECORD_TEXT,2,3)
         WHEN b.Record_type = 'D30' THEN SUBSTRING(b.RECORD_TEXT,9,8)
         WHEN b.Record_type = 'H02' THEN SUBSTRING(b.RECORD_TEXT,13,4)
       END AS RecordTextSplit
 FROM cteLead AS a
 JOIN cte AS b ON b.DERIVED_COLUMN >= a.DERIVED_COLUMN_LEFT AND b.DERIVED_COLUMN < a.DERIVED_COLUMN_RIGHT
 WHERE b.Record_type IN ('H01', 'D20', 'D30', 'H02')
 )
 -- One row per transaction, one column per record type
 SELECT * FROM cteSplit
 PIVOT
 (
    MAX(RecordTextSplit)
    FOR Record_Type IN (H01, D20, D30, H02)
 ) AS pvt
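
The query above comes with no explanation, so here is a hedged Python sketch of what it appears to do: pair each H01 row with the position of the next H01 (the LEAD step), then collect the rows that fall inside that half-open range and cut out the substrings. The `SLICES` table and the `substr` helper are names made up for this sketch, and the sequence numbers stand in for CREATE_TMSP. It assumes each record type occurs at most once per transaction, so a plain assignment replaces the MAX in the PIVOT.

```python
# (sequence number standing in for CREATE_TMSP, record type, record text)
ROWS = [
    (1, "H01", "ABCDEFGHI123456789"),
    (2, "D45", "BCDEFGH098765"),
    (3, "D20", "BABRTHYUHU56789"),
    (4, "D30", "QWERTY09876558471255"),
    (5, "D12", "ASDFGHJ9814752"),
    (6, "H02", "UGHRYCGDF12304025"),
    (7, "G80", "YHNBGTRFV0147852"),
    (8, "H01", "MLOPKIJUHNB624817"),
    (9, "D20", "PLKIJUNHMY7653235"),
    (10, "H15", "MVNBDGETDGSTEX9874"),
    (11, "D30", "GNHGDTBFJVNV834687"),
    (12, "H02", "JDGHKDGHSDFIG7845387"),
    (13, "D60", "GHCNDBDGCTEF45367"),
]

# 1-based start position and length, as passed to SUBSTRING in the query.
SLICES = {"H01": (1, 10), "D20": (2, 3), "D30": (9, 8), "H02": (13, 4)}


def substr(text, start, length):
    """SQL-style substring: 1-based start, fixed length."""
    return text[start - 1:start - 1 + length]


# Positions of the H01 rows, each paired with the next one (999999 closes the last range).
heads = [seq for seq, record_type, _ in ROWS if record_type == "H01"]
ranges = list(zip(heads, heads[1:] + [999999]))

pivoted = []
for left, right in ranges:
    row = {"ID": left}
    for seq, record_type, text in ROWS:
        if left <= seq < right and record_type in SLICES:
            row[record_type] = substr(text, *SLICES[record_type])
    pivoted.append(row)

for row in pivoted:
    print(row)
```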

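Answer 2: (score: 0)

Once the timestamp column has been added, it is easy to assign a unique number to each transaction. You can then apply your existing calculation: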

SELECT
   trans#,
      MAX(CASE WHEN RECORD_TYPE = 'H01' THEN        (SUBSTR(RECORD_TEXT, 1,10)) END)
   || MAX(CASE WHEN RECORD_TYPE = 'D20' THEN ',' || (SUBSTR(RECORD_TEXT, 2, 3)) END)
   || MAX(CASE WHEN RECORD_TYPE = 'D30' THEN ',' || (SUBSTR(RECORD_TEXT, 9, 8)) END)
   || MAX(CASE WHEN RECORD_TYPE = 'H02' THEN ',' || (SUBSTR(RECORD_TEXT,13, 4)) END) AS TOTAL_FIELD
FROM 
 (
  SELECT CREATE_TMSP,RECORD_TYPE, RECORD_TEXT, 
     -- assign a unique number to each transaction
     SUM(CASE WHEN Record_type = 'H01' THEN 1 ELSE 0 END) 
     OVER (ORDER BY CREATE_TMSP
           ROWS UNBOUNDED PRECEDING) AS trans#
  FROM table
  -- more efficient to filter unneeded data before the OLAP function
  WHERE RECORD_TYPE IN ('H01','D20','D30','H02')
  -- uncomment if the data doesn't start with an 'H01' row and you don't want partial transactions
  -- QUALIFY trans# > 0
 ) AS dt
GROUP BY trans#
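
For readers who want to trace the logic outside the database, the following Python sketch mimics the query step by step: filter to the needed record types, take a running count of H01 rows as the transaction number, then build the concatenated field per transaction. It is only an illustration under stated assumptions: the sequence numbers stand in for CREATE_TMSP, and `SLICES`, `ORDER` and `total_field` are names invented for the sketch.

```python
from itertools import accumulate

# (sequence number standing in for CREATE_TMSP, record type, record text)
ROWS = [
    (1, "H01", "ABCDEFGHI123456789"),
    (2, "D45", "BCDEFGH098765"),
    (3, "D20", "BABRTHYUHU56789"),
    (4, "D30", "QWERTY09876558471255"),
    (5, "D12", "ASDFGHJ9814752"),
    (6, "H02", "UGHRYCGDF12304025"),
    (7, "G80", "YHNBGTRFV0147852"),
    (8, "H01", "MLOPKIJUHNB624817"),
    (9, "D20", "PLKIJUNHMY7653235"),
    (10, "H15", "MVNBDGETDGSTEX9874"),
    (11, "D30", "GNHGDTBFJVNV834687"),
    (12, "H02", "JDGHKDGHSDFIG7845387"),
    (13, "D60", "GHCNDBDGCTEF45367"),
]

# 1-based start position and length, as passed to SUBSTR in the query.
SLICES = {"H01": (1, 10), "D20": (2, 3), "D30": (9, 8), "H02": (13, 4)}
ORDER = ("H01", "D20", "D30", "H02")

# WHERE RECORD_TYPE IN (...): drop unneeded rows before numbering, keeping CREATE_TMSP order.
wanted = sorted(row for row in ROWS if row[1] in SLICES)

# Running count of H01 rows, like SUM(CASE ...) OVER (ORDER BY ... ROWS UNBOUNDED PRECEDING).
trans_numbers = list(accumulate(1 if row[1] == "H01" else 0 for row in wanted))

pieces = {}
for trans, (_, record_type, text) in zip(trans_numbers, wanted):
    if trans == 0:
        continue  # same effect as QUALIFY trans# > 0: skip rows before the first H01
    start, length = SLICES[record_type]
    piece = text[start - 1:start - 1 + length]
    current = pieces.setdefault(trans, {}).get(record_type)
    # MAX(CASE ...) keeps the largest value per record type within a transaction.
    pieces[trans][record_type] = piece if current is None else max(current, piece)


def total_field(parts):
    """Join the four pieces; a missing piece gives None, like NULL in SQL concatenation."""
    ordered = [parts.get(key) for key in ORDER]
    return None if None in ordered else ",".join(ordered)


result = {trans: total_field(parts) for trans, parts in pieces.items()}
for trans, field in result.items():
    print(trans, field)
```

On the sample data this yields the two rows the question asks for, one per transaction number.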