I need to pivot rows into columns in SSIS. I am using Integration Services in Microsoft Visual Studio 2010.
I have a flat file containing the following information:
column0      column1       column2
-------------------------------------
d-5454-s34   name          Frans
d-5454-s34   sd            xyh
d-5454-s34   description   Group zen
d-5454-s34   member        xxxx
d-5454-s34   member        yyyy
d-5454-s34   member        zzzzz
d-5454-s34   member        uuuuu
d-5454-s45   name          He-man
d-5454-s45   sd            ygh
d-5454-s45   description   Group Comics
d-5454-s45   member        eeee
d-5454-s45   member        ffffff
e-3434-t45   name          Calvin
e-3434-t45   sd            trdg
The final output should be:
id           name     sd     description    member
---------------------------------------------------------------------------
d-5454-s34   Frans    xyh    Group zen      xxxx; yyyy; zzzzz; uuuuu
d-5454-s45   He-man   ygh    Group Comics   eeee; ffffff
e-3434-t45   Calvin   trdg   NULL           NULL
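I read the file with a Flat File Source, and the rows arrive exactly as in the first table above (the data before the final output).
I then configured the Pivot transformation in SSIS as follows: the PIVOT KEY is column1 (it holds the values name, sd, description and member, and member repeats across rows), the SET KEY is column0, because it holds the ID that must not repeat, and the pivot value is column2. I named the pivot output columns C_NAME, C_sd, C_description, C_member. Because member repeats across several rows for the same ID, the component throws a duplicate key value "member" error. How can I get around this?
As a test, I removed the extra member rows so that each ID had only one, and then the pivot worked. So I need a way to aggregate the rows in which member repeats for the same ID (column0). How can I use an SSIS aggregate function to group only the member rows in column1 and concatenate all the distinct column2 values for those rows, separated by ";" as in the table above? Thanks.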
Answer 0 (score: 1)
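You need to change your approach slightly and perform the aggregate operation before actually transforming (pivoting) the data.
I built a sample package to demonstrate the solution.
As the package shows, the data first needs to be sorted (by column0, then column1), because the script compares each record with the previous one. Next we need a Script Component (type Transformation). Select all the required input columns and create the matching output columns. The output columns use the same data types as the inputs; just make sure to increase the size of the last column (Column2, which holds the concatenated values). Also make sure the Script Component is asynchronous (set the output's SynchronousInputID to None), because it emits a different number of rows than it receives.
Use the following code in the Script Component. It checks the values of the previous row and, for related records, appends the data as a semicolon-separated list.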
bool initialRow = true; // Indicator for the first row
string column0 = "";
string column1 = "";
string column2 = "";

public override void Input0_ProcessInput(Input0Buffer Buffer)
{
    // Loop through the buffer
    while (Buffer.NextRow())
    {
        // Process an input row
        Input0_ProcessInputRow(Buffer);
        // Change the indicator after the first row has been processed
        initialRow = false;
    }
    // After the last buffer, flush the pending values into the existing output row.
    // Skipped when no input row was ever seen, because no output row exists then.
    if (Buffer.EndOfRowset() && !initialRow)
    {
        Output0Buffer.Column0 = column0;
        Output0Buffer.Column1 = column1;
        Output0Buffer.Column2 = column2;
    }
}

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    if (initialRow)
    {
        // First input row only: create a new output row
        Output0Buffer.AddRow();
        // Now fill the variables with the values from the input row
        column0 = Row.column0;
        column1 = Row.column1;
        column2 = Row.column2;
    }
    else if ((column0 != Row.column0) || (column1 != Row.column1))
    {
        // Not the first row, and either column0 or column1 changed.
        // Fill the columns of the existing output row with the values
        // from the variables before creating a new output row
        Output0Buffer.Column0 = column0;
        Output0Buffer.Column1 = column1;
        Output0Buffer.Column2 = column2;
        // Create a new output row
        Output0Buffer.AddRow();
        // Now fill the variables with the values from the input row
        column0 = Row.column0;
        column1 = Row.column1;
        column2 = Row.column2;
    }
    else if (column1 == "member")
    {
        // Same column0 and column1 as the previous row, and the key is "member":
        // append the member value, using "; " to match the output in the question
        column2 += "; " + Row.column2;
    }
}
Reference: link
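The script above only collapses the repeated member rows; the Pivot transformation still has to run downstream of it. As a rough illustration of what the two steps produce together, here is a minimal standalone console sketch. It is not Script Component code: it assumes a current .NET SDK (tuples and `out var` are not available in the Visual Studio 2010 script editor), it groups with a dictionary instead of comparing each row with the previous one, and the names `PivotSketch` and `Pivot` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PivotSketch
{
    // Hypothetical helper (not an SSIS API): collapses (id, key, value) rows
    // into one dictionary of columns per id, joining repeated keys with "; ".
    public static SortedDictionary<string, Dictionary<string, string>> Pivot(
        IEnumerable<(string Id, string Key, string Value)> rows)
    {
        var result = new SortedDictionary<string, Dictionary<string, string>>(StringComparer.Ordinal);
        foreach (var row in rows)
        {
            if (!result.TryGetValue(row.Id, out var columns))
            {
                columns = new Dictionary<string, string>();
                result[row.Id] = columns;
            }
            // A repeated key (only "member" in the sample data) is appended
            // instead of being treated as a duplicate-key error.
            columns[row.Key] = columns.TryGetValue(row.Key, out var existing)
                ? existing + "; " + row.Value
                : row.Value;
        }
        return result;
    }

    public static void Main()
    {
        // Sample rows copied from the question.
        var rows = new (string Id, string Key, string Value)[]
        {
            ("d-5454-s34", "name", "Frans"),
            ("d-5454-s34", "sd", "xyh"),
            ("d-5454-s34", "description", "Group zen"),
            ("d-5454-s34", "member", "xxxx"),
            ("d-5454-s34", "member", "yyyy"),
            ("d-5454-s34", "member", "zzzzz"),
            ("d-5454-s34", "member", "uuuuu"),
            ("d-5454-s45", "name", "He-man"),
            ("d-5454-s45", "sd", "ygh"),
            ("d-5454-s45", "description", "Group Comics"),
            ("d-5454-s45", "member", "eeee"),
            ("d-5454-s45", "member", "ffffff"),
            ("e-3434-t45", "name", "Calvin"),
            ("e-3434-t45", "sd", "trdg"),
        };

        foreach (var entry in Pivot(rows))
        {
            // Missing columns are printed as NULL, as in the expected output.
            string Get(string key) => entry.Value.TryGetValue(key, out var v) ? v : "NULL";
            Console.WriteLine(string.Join(" | ",
                entry.Key, Get("name"), Get("sd"), Get("description"), Get("member")));
        }
    }
}
```

Run as a console program, this prints one line per ID with the members joined by "; ", which is the shape the question asks for. Inside the package itself, the same result comes from placing the Pivot transformation after the Script Component, since each ID then carries at most one member row.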
Answer 1 (score: 1)
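SSIS offers many transformations, but most of the time loading the data into a temporary table and writing a simple query saves a lot of time, and it may well perform better.
For example: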
-- Sample data as a CTE; in practice this would be the temporary table loaded by SSIS
with tempTable as (
    select 'd-5454-s34' column0, 'name' column1, 'Frans' column2
    union all select 'd-5454-s34', 'sd', 'xyh'
    union all select 'd-5454-s34', 'description', 'Group zen'
    union all select 'd-5454-s34', 'member', 'xxxx'
    union all select 'd-5454-s34', 'member', 'yyyy'
    union all select 'd-5454-s34', 'member', 'zzzzz'
    union all select 'd-5454-s34', 'member', 'uuuuu'
    union all select 'd-5454-s45', 'name', 'He-man'
    union all select 'd-5454-s45', 'sd', 'ygh'
    union all select 'd-5454-s45', 'description', 'Group Comics'
    union all select 'd-5454-s45', 'member', 'eeee'
    union all select 'd-5454-s45', 'member', 'ffffff'
    union all select 'e-3434-t45', 'name', 'Calvin'
    union all select 'e-3434-t45', 'sd', 'trdg'
)
SELECT column0
     , [name]
     , [sd]
     , [description]
     , [member]
FROM ( SELECT t2.column0, t2.column1, t2.column2
            -- Build the '; '-separated member list per id;
            -- STUFF removes the two-character leading '; '
            , STUFF(( SELECT '; ' + t1.column2
                      FROM tempTable t1
                      WHERE t1.column0 = t2.column0
                        AND t1.column1 = 'member'
                      FOR XML PATH('') ), 1, 2, '') AS [member]
       FROM tempTable t2 ) t
PIVOT ( MAX(column2) FOR column1 IN ([name], [sd], [description]) ) AS pivotTable;