Combine two different CSV files (data sets) into one in SSIS

Time: 2015-09-15 21:36:49

Tags: sql sql-server tsql ssis

Here is my situation. I created the following table:

CREATE TABLE temp1
(
ProcessName varchar(50),
ProcessNo  varchar(50),
ProcessPages varchar(50),
ProcessClass varchar (50)
)

INSERT INTO temp1 VALUES ('PRO1','PR012','5','O');
INSERT INTO temp1 VALUES ('PRO1','PR012','4','Y');
INSERT INTO temp1 VALUES ('PRO1','PR012','10','Y');
INSERT INTO temp1 VALUES ('PRO1','PR012','7','Y');
INSERT INTO temp1 VALUES ('PRO1','PR012','6','Y');
INSERT INTO temp1 VALUES ('PRO1','PR012','14','K');
INSERT INTO temp1 VALUES ('PRO1','PR012','23','Y');
INSERT INTO temp1 VALUES ('PRO1','PR012','45','L');
INSERT INTO temp1 VALUES ('PRO1','PR012','52','Y');
INSERT INTO temp1 VALUES ('PRO2','PR022','3','K');
INSERT INTO temp1 VALUES ('PRO2','PR022','5','T');
INSERT INTO temp1 VALUES ('PRO2','PR022','6','Y');
INSERT INTO temp1 VALUES ('PRO2','PR022','5','Y');
INSERT INTO temp1 VALUES ('PRO2','PR022','5','H');
INSERT INTO temp1 VALUES ('PRO2','PR022','8','Y');
INSERT INTO temp1 VALUES ('PRO2','PR022','5','T');
INSERT INTO temp1 VALUES ('PRO2','PR022','2','Y');
INSERT INTO temp1 VALUES ('PRO2','PR022','3','T');
INSERT INTO temp1 VALUES ('PRO2','PR022','3','Y');
INSERT INTO temp1 VALUES ('PRO3','PR032','5','Y');
INSERT INTO temp1 VALUES ('PRO3','PR032','10','Y');
INSERT INTO temp1 VALUES ('PRO3','PR032','15','Y');
INSERT INTO temp1 VALUES ('PRO3','PR032','25','Y');
INSERT INTO temp1 VALUES ('PRO3','PR032','35','Y');
INSERT INTO temp1 VALUES ('PRO3','PR032','45','Y');
INSERT INTO temp1 VALUES ('PRO3','PR032','55','Y');
INSERT INTO temp1 VALUES ('PRO3','PR032','25','Y');
INSERT INTO temp1 VALUES ('PRO3','PR032','25','Y');
INSERT INTO temp1 VALUES ('PRO3','PR032','20','Y');
INSERT INTO temp1 VALUES ('PRO3','PR032','3','O');
INSERT INTO temp1 VALUES ('PRO3','PR032','3','K');

Here is what I am trying to do in SSIS: I want to combine the following two data sources.

   ----First Data Source
    DECLARE @ColumnNames NVARCHAR(MAX)
    DECLARE @SQL NVARCHAR(MAX)

    SELECT @ColumnNames=Stuff((SELECT DISTINCT ',' + Quotename(ProcessName)
                       FROM [temp1]
                       FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
    SET @SQL= '
    SELECT * FROM (
        SELECT [ProcessName], 
            COUNT(CASE WHEN ProcessPages >=  1 and ProcessPages <=  5 THEN ''1p'' END) AS [1p],
            COUNT(CASE WHEN ProcessPages >=  6 and ProcessPages <= 10 THEN ''2p'' END) AS [2p],
            COUNT(CASE WHEN ProcessPages >= 11 and ProcessPages <= 16 THEN ''3p'' END) AS [3p],
            COUNT(CASE WHEN ProcessPages >= 17 and ProcessPages <= 50 THEN ''4p'' END) AS [4p],
            COUNT(CASE WHEN ProcessPages >  50 THEN ''5p'' END) AS [5p],
            COUNT([ProcessName]) AS Total
        FROM temp1
        GROUP BY [ProcessName]) AS SOURCE
        UNPIVOT ( val FOR Process_No_In_Cases IN ([1p],[2p],[3p],[4p],[5p],[Total]) ) U
        PIVOT ( MAX(val) FOR ProcessName IN('+@ColumnNames +') ) As PivotTable'
    EXECUTE sp_executesql @SQL


    ------Second Data Source
    DECLARE @ColumnNames2 NVARCHAR(MAX)
    DECLARE @SQL2 NVARCHAR(MAX)
    SELECT @ColumnNames2=Stuff((SELECT DISTINCT ',' + Quotename([ProcessName])
                       FROM temp1
                       FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
    SET @SQL2= '
    SELECT * FROM (
        SELECT [ProcessName], 
            COUNT(CASE WHEN [ProcessClass]=''Y'' THEN ''Coordinated'' END) AS [Coordinated],
            COUNT([ProcessName]) AS Total
        FROM temp1
        GROUP BY [ProcessName]) AS SOURCE
        UNPIVOT ( val FOR [''] IN ([Coordinated],[Total]) ) U
        PIVOT ( MAX(val) FOR [ProcessName] IN('+@ColumnNames2 +') ) As PivotTable'

    --print @sql
EXECUTE sp_executesql @SQL2

I want to combine both outputs into a single CSV file, like this:

Process_No_In_Cases PRO1    PRO2    PRO3
1p                   2       8        3
2p                   3       2        1
3p                   1       0        1
4p                   2       0        6
5p                   1       0        1
Total                9       10       12


Coordinated          6       5        10
Total                9       10       12

Any suggestions?

-------- Update:

Using the same table, temp1:

I have two queries:

SELECT  *
 FROM (
    SELECT [ProcessName], 
        COUNT(CASE WHEN [ProcessPages] >=  1 and [ProcessPages] <=  5 THEN '1p' END) AS [1p],
        COUNT(CASE WHEN [ProcessPages] >=  6 and [ProcessPages] <= 10 THEN '2p' END) AS [2p],
        COUNT(CASE WHEN [ProcessPages] >= 11 and [ProcessPages] <= 16 THEN '3p' END) AS [3p],
        COUNT(CASE WHEN [ProcessPages] >= 17 and [ProcessPages] <= 50 THEN '4p' END) AS [4p],
        COUNT(CASE WHEN [ProcessPages] >  50 THEN '5p' END) AS [5p],
        COUNT([ProcessName]) AS Total
    FROM temp1
    GROUP BY [ProcessName]) AS SOURCE
    UNPIVOT ( val FOR [Pro_No_In_Cases] IN ([1p],[2p],[3p],[4p],[5p],[Total]) ) U
    PIVOT ( MAX(val) FOR [ProcessName] IN ([PRO1],[PRO2],[PRO3] 
            ) ) As PivotTable

-- and

WITH CTE_Temp AS (
    SELECT 
       [ProcessName] =    
       CASE 
          WHEN [ProcessPages] >=  1 AND [ProcessPages] <=  5 THEN '1p'
          WHEN [ProcessPages] >=  6 AND [ProcessPages] <= 10 THEN '2p'
          WHEN [ProcessPages] >= 11 AND [ProcessPages] <= 16 THEN '3p'
          WHEN [ProcessPages] >= 17 AND [ProcessPages] <= 50 THEN '4p'
          WHEN [ProcessPages] >  50   THEN '5p'
       END
       , [ProcessClass]    
    FROM temp1
)
SELECT
    [ProcessName] = CASE WHEN GROUPING([ProcessName]) = 0 THEN [ProcessName] ELSE 'Total' END 
    , Coordinated     = COUNT(CASE WHEN [ProcessClass]  =  'Y' THEN [ProcessClass]  END)
    , Uncordinated = COUNT(CASE WHEN [ProcessClass]  <> 'Y' THEN [ProcessClass]  END)
FROM CTE_Temp
GROUP BY GROUPING SETS ([ProcessName], ())

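In SSIS, I put the two queries into two separate OLE DB sources and joined them with a Union All, but that still does not work the way I need. How can I write both sources to a single CSV file with blank lines between them? Please see my new example: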

Pro_No_In_cases PRO1    PRO2    PRO3    
1p               2       8       3  
2p               3       2       1  
3p               1       0       1  
4p               2       0       6  
5p               1       0       1  
Total            9       10      12 



 ProcessName    Coordinated Uncordinated    
   1p               5          8    
   2p               6          0    
   3p               1          1    
   4p               7          1    
   5p               2          0    
   Total            21         10

Any suggestions?

1 answer:

Answer 0: (score: 0)

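@Solution
Using SQL, you can try adding one to three empty records between the two data sets and writing everything out as a single result: Query 1 UNION ALL empty records UNION ALL Query 2. Use UNION ALL rather than UNION here, because UNION removes duplicate rows and would collapse several identical empty records into one.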
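
As a rough sketch of that idea (not a tested SSIS package), the two static queries from the update can be wrapped in CTEs and stacked in one statement. Everything is cast to varchar so the blocks line up, the header of the second block is emitted as a data row, and the second block is padded with an empty fourth column. BlockNo, RowNo, Q1, Q2 and Combined are hypothetical helper names introduced only for this illustration.

```sql
WITH Q1 AS (
    -- First query from the question: page-range counts per process.
    SELECT [Pro_No_In_Cases], [PRO1], [PRO2], [PRO3]
    FROM (
        SELECT [ProcessName],
            COUNT(CASE WHEN [ProcessPages] >=  1 AND [ProcessPages] <=  5 THEN '1p' END) AS [1p],
            COUNT(CASE WHEN [ProcessPages] >=  6 AND [ProcessPages] <= 10 THEN '2p' END) AS [2p],
            COUNT(CASE WHEN [ProcessPages] >= 11 AND [ProcessPages] <= 16 THEN '3p' END) AS [3p],
            COUNT(CASE WHEN [ProcessPages] >= 17 AND [ProcessPages] <= 50 THEN '4p' END) AS [4p],
            COUNT(CASE WHEN [ProcessPages] >  50 THEN '5p' END) AS [5p],
            COUNT([ProcessName]) AS Total
        FROM temp1
        GROUP BY [ProcessName]) AS SOURCE
    UNPIVOT ( val FOR [Pro_No_In_Cases] IN ([1p],[2p],[3p],[4p],[5p],[Total]) ) U
    PIVOT ( MAX(val) FOR [ProcessName] IN ([PRO1],[PRO2],[PRO3]) ) AS PivotTable
),
CTE_Temp AS (
    -- Second query from the question, step 1: bucket each row by page range.
    SELECT
        [ProcessName] =
        CASE
            WHEN [ProcessPages] >=  1 AND [ProcessPages] <=  5 THEN '1p'
            WHEN [ProcessPages] >=  6 AND [ProcessPages] <= 10 THEN '2p'
            WHEN [ProcessPages] >= 11 AND [ProcessPages] <= 16 THEN '3p'
            WHEN [ProcessPages] >= 17 AND [ProcessPages] <= 50 THEN '4p'
            WHEN [ProcessPages] >  50 THEN '5p'
        END
        , [ProcessClass]
    FROM temp1
),
Q2 AS (
    -- Second query, step 2: counts per bucket plus a grand-total row.
    SELECT
        [ProcessName] = CASE WHEN GROUPING([ProcessName]) = 0 THEN [ProcessName] ELSE 'Total' END
        , Coordinated  = COUNT(CASE WHEN [ProcessClass] =  'Y' THEN [ProcessClass] END)
        , Uncordinated = COUNT(CASE WHEN [ProcessClass] <> 'Y' THEN [ProcessClass] END)
    FROM CTE_Temp
    GROUP BY GROUPING SETS ([ProcessName], ())
),
Combined AS (
    -- Block 1: first result set, all columns cast to text.
    SELECT 1 AS BlockNo, 1 AS RowNo,
           CAST([Pro_No_In_Cases] AS varchar(50)) AS [Pro_No_In_Cases],
           CAST([PRO1] AS varchar(50)) AS [PRO1],
           CAST([PRO2] AS varchar(50)) AS [PRO2],
           CAST([PRO3] AS varchar(50)) AS [PRO3]
    FROM Q1
    UNION ALL
    -- Block 2: one blank separator row (repeat this SELECT for more).
    SELECT 2, 1, '', '', '', ''
    UNION ALL
    -- Block 3: header of the second result set, emitted as data.
    SELECT 3, 0, 'ProcessName', 'Coordinated', 'Uncordinated', ''
    UNION ALL
    -- Block 3: second result set, padded with an empty fourth column.
    SELECT 3, 1,
           CAST([ProcessName] AS varchar(50)),
           CAST(Coordinated   AS varchar(50)),
           CAST(Uncordinated  AS varchar(50)),
           ''
    FROM Q2
)
SELECT [Pro_No_In_Cases], [PRO1], [PRO2], [PRO3]
FROM Combined
ORDER BY BlockNo, RowNo, [Pro_No_In_Cases];
```

The explicit ORDER BY matters because UNION ALL on its own does not promise any row order. Note that in a CSV file the separator row will most likely appear as a line of empty fields (for example `,,,`) rather than a truly empty line.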


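Put the first query into an OLE DB Source, with the data access mode set to SQL command.
Put the second query into another OLE DB Source, again with the data access mode set to SQL command.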



Now add a Union All transformation and connect both OLE DB sources to it.
See http://social.technet.microsoft.com/wiki/contents/articles/7257.ssis-union-all-transformation.aspx


Now you have a single data source.
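
If you go the Union All transformation route, the blank separator lines still have to come from somewhere. One possible approach, sketched here under assumptions, is a third OLE DB Source that returns only empty rows. The BlockNo column is a hypothetical ordering key: the idea is that the first source would carry BlockNo 1, this separator source 2, and the second source 3, so that a Sort transformation placed after the Union All can restore the intended order, since rows from different inputs may not arrive in a predictable sequence.

```sql
-- Hypothetical query for a third OLE DB Source: two blank separator rows.
-- Column names and types mirror the first result set so they map cleanly
-- in the Union All transformation; BlockNo is an assumed sort key.
SELECT
    2 AS BlockNo,
    CAST('' AS varchar(50)) AS [Pro_No_In_Cases],
    CAST('' AS varchar(50)) AS [PRO1],
    CAST('' AS varchar(50)) AS [PRO2],
    CAST('' AS varchar(50)) AS [PRO3]
FROM (VALUES (1), (2)) AS Blank(n);   -- one output row per value listed
```

For this to line up, the other two sources would also need to return text columns and a BlockNo column, and BlockNo would be left unmapped in the flat file destination so it does not appear in the CSV.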