SSIS: merging a variable number of columns

Time: 2017-08-18 10:54:22

Tags: sql-server ssis etl

Using SSIS, I am importing a .txt file, which for the most part is straightforward.

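The file being imported has a fixed number of columns up to a point, but it ends with a free-text/comments field that can repeat to an unknown length, similar to the sample below.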

   "000001","J Smith","Red","Free text here"
   "000002","A Ball","Blue","Free text here","but can","continue"
   "000003","W White","Green","Free text here","but can","continue","indefinitely"
   "000004","J Roley","Red","Free text here"

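What I would ideally like to do (within SSIS) is keep the first three columns as individual columns but merge any free-text columns into a single column, i.e. merge/concatenate anything that appears after the "Color" column.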

So when I load it into a table and view it in SSMS, it looks like this:

000001 | J Smith | Red   | Free text here                                     |
000002 | A Ball  | Blue  | Free text here but can continue                    |
000003 | W White | Green | Free text here but can continue indefinitely       |
000004 | J Roley | Red   | Free text here                                     |

2 answers:

Answer 0 (score: 0)

I don't see any easy solution. You could try something like the following:

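1. Load the complete raw data into a temp table (without any delimiters):

Steps:

  1. Create the temp table in an "Execute SQL Task".
  2. Create a Data Flow Task with a Flat File Source (using the Ragged Right format) and an OLE DB Destination (using the #temp table created in the previous task).
  3. Set DelayValidation=True for the connection manager and the DFT.
  4. Set RetainSameConnection=True for the connection manager.
  5. Refer to this to create the temp table and use it.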

2. Create T-SQL to separate the 3 columns (as shown below):

    with col1 as (
      Select 
      [Val],
      substring([Val], 1 ,charindex(',', [Val]) - 1) col1,
      len(substring([Val], 1 ,charindex(',', [Val]))) + 1 col1Len
      from #temp
    ), col2 as (
      select 
      [Val],
      col1,
      substring([Val], col1Len, charindex(',', [Val], col1Len) - col1Len) as col2,
      charindex(',', [Val], col1Len) + 1 col2Len
       from col1
    ) select col1, col2, substring([Val], col2Len, 200) as col3
    from col2
    

T-SQL output:

    col1    col2    col3
    "000001"    "J Smith"   "Red","Free text here"
    "000002"    "A Ball"    "Blue","Free text here","but can","continue"
    "000003"    "W White"   "Green","Free text here","but can","continue","indefinitely"
    

3. Use the above query in an OLE DB Source in a different Data Flow Task.

Replace the double quotes (") as per your requirement.

Answer 1 (score: 0)

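This was a fun exercise:

Add a Data Flow.

Add a Script Component (select Source).

Add 4 columns to the output: ID, Name, Color, FreeText, all of type string.

Edit the script:

Paste the following namespaces at the top: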

using System.Text.RegularExpressions;
using System.Linq;

Paste the following code into CreateNewOutputRows:

    string strPath = @"a:\test.txt";  // put your file path in here
    var lines = System.IO.File.ReadAllLines(strPath);

    foreach (string line in lines)
    {
        //Code I stole to read CSV
        string delimiter = ",";

        Regex rgx = new Regex(String.Format("(\"[^\"]*\"|[^{0}])+", delimiter));
        var cols = rgx.Matches(line)
                      .Cast<Match>()
                      .Select(m => m.Value.Trim().Trim('"'))
                      .Where(v => !string.IsNullOrWhiteSpace(v));
        //create a column counter
        int ctr = 0;

        Output0Buffer.AddRow();

        //Preset FreeText to empty string
        string FreeTextBuilder = String.Empty;

        foreach( string col in cols)
        {

            switch (ctr)
            {
                case 0: 
                    Output0Buffer.ID = col; 
                    break;
                case 1:
                    Output0Buffer.Name = col;
                    break;
                case 2:
                    Output0Buffer.Color = col;
                    break;
                default:
                    FreeTextBuilder += col + " ";
                    break;
            }
            ctr++;
        }

        Output0Buffer.FreeText = FreeTextBuilder.Trim();

    }
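
If you want to check the split-and-merge logic outside SSIS first, the same idea can be pulled into a small standalone helper. This is only a sketch: the class name `FreeTextMerger`, the method `SplitAndMerge`, and the `fixedColumns` parameter are made up for illustration and are not part of SSIS. It reuses the regex from the script above and assumes the fixed columns are never empty, because that pattern skips empty fields and would shift later values to the left.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of SSIS: mirrors the loop in CreateNewOutputRows.
public static class FreeTextMerger
{
    // Same pattern as the script above: a field is a run of quoted strings
    // or non-comma characters.
    private static readonly Regex FieldPattern = new Regex("(\"[^\"]*\"|[^,])+");

    // Returns fixedColumns + 1 values: the leading fixed columns, then every
    // remaining field joined with single spaces.
    public static string[] SplitAndMerge(string line, int fixedColumns = 3)
    {
        string[] fields = FieldPattern.Matches(line)
            .Cast<Match>()
            .Select(m => m.Value.Trim().Trim('"'))
            .Where(v => !string.IsNullOrWhiteSpace(v))
            .ToArray();

        string[] result = new string[fixedColumns + 1];
        for (int i = 0; i < fixedColumns; i++)
        {
            // Pad with empty strings if the line has fewer fields than expected.
            result[i] = i < fields.Length ? fields[i] : string.Empty;
        }

        // Everything after the fixed columns becomes the free-text column.
        result[fixedColumns] = string.Join(" ", fields.Skip(fixedColumns));
        return result;
    }
}
```

Inside the Script Component you could then call `FreeTextMerger.SplitAndMerge(line)` for each line and assign the four returned values to `ID`, `Name`, `Color` and `FreeText`.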