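I am reading a 17-column CSV file into a database. Occasionally the file contains a row with fewer than 17 columns. I tried to ignore such rows, but even with every column set to ignore, the row is not skipped and the package fails.
How can I ignore these rows?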
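Answer 0 (score: 4)
You can do this by adding a Flat File Connection Manager that defines only one column, with data type DT_WSTR and length 4000 (assume it is named Column0), so that every line is read as one big column.
1. In the Dataflow task, add a Script Component after the Flat File Source.
2. Mark Column0 as an input column and add 17 output columns.
3. In the Input0_ProcessInputRow method, split Column0 on the delimiter, then check whether the resulting array has length 17. If it does, assign the values to the output columns; otherwise ignore the row.
In detail:
1. In the Flat File Connection Manager, give the single column the data type DT_WSTR and length = 4000.
2. In the Script Component, select Column0 as the input column.
3. Change the OutputBuffer's SynchronousInput property to None.
4. Set the script language to Visual Basic.
5. In the script editor, write the following script: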
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        ' Skip null or blank lines
        If Not Row.Column0_IsNull AndAlso
           Not String.IsNullOrEmpty(Row.Column0.Trim) Then
            ' Split the raw line on the delimiter (";" here; change it to match the file)
            Dim strColumns As String() = Row.Column0.Split(CChar(";"))
            ' Ignore any row that does not have exactly 17 columns
            If strColumns.Length <> 17 Then Exit Sub
            Output0Buffer.AddRow()
            Output0Buffer.Column = strColumns(0)
            Output0Buffer.Column1 = strColumns(1)
            Output0Buffer.Column2 = strColumns(2)
            Output0Buffer.Column3 = strColumns(3)
            Output0Buffer.Column4 = strColumns(4)
            Output0Buffer.Column5 = strColumns(5)
            Output0Buffer.Column6 = strColumns(6)
            Output0Buffer.Column7 = strColumns(7)
            Output0Buffer.Column8 = strColumns(8)
            Output0Buffer.Column9 = strColumns(9)
            Output0Buffer.Column10 = strColumns(10)
            Output0Buffer.Column11 = strColumns(11)
            Output0Buffer.Column12 = strColumns(12)
            Output0Buffer.Column13 = strColumns(13)
            Output0Buffer.Column14 = strColumns(14)
            Output0Buffer.Column15 = strColumns(15)
            Output0Buffer.Column16 = strColumns(16)
        End If
    End Sub
Finally, map the output columns to the destination columns.
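For readers who set the script language to C# instead, the split-and-check step can be sketched as a small standalone helper. This is only an illustration of the same idea, not part of the original answer: the names RowFilter and TrySplitRow are hypothetical, and the delimiter and expected column count are passed in as parameters rather than hard-coded.

```csharp
using System;

public static class RowFilter
{
    // Returns true and the split fields only when the line has exactly
    // expectedColumns fields; returns false so the caller can skip the row.
    public static bool TrySplitRow(string line, char delimiter, int expectedColumns, out string[] fields)
    {
        fields = new string[0];
        if (string.IsNullOrWhiteSpace(line))
        {
            return false; // null or blank line: nothing to load
        }

        string[] parts = line.Split(delimiter);
        if (parts.Length != expectedColumns)
        {
            return false; // wrong number of columns: ignore the row
        }

        fields = parts;
        return true;
    }
}
```

Inside the script component, Input0_ProcessInputRow would call something like RowFilter.TrySplitRow(Row.Column0, ';', 17, out fields) and only call Output0Buffer.AddRow() when it returns true. Note that a plain Split does not understand quoted fields, so a value that itself contains the delimiter would be counted as two columns.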
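Answer 1 (score: 1)
A C# solution that loads the CSV and skips rows that do not have 17 columns:
Use a Script Component, and on the Inputs and Outputs screen add all the output columns with their data types.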
    string fName = @"C:\test.csv"; // Full file path; in practice it should come from a variable
    string[] lines = System.IO.File.ReadAllLines(fName);
    // Line counter, used to skip the header row
    int ctr = 1;
    foreach (string line in lines)
    {
        string[] cols = line.Split(',');
        if (ctr != 1) // Assumes a header row; remove this check if the first row holds data
        {
            if (cols.Length == 17)
            {
                // Write the row to the output
                Output0Buffer.AddRow();
                Output0Buffer.Col1 = cols[0]; // Already a string; other columns must be converted to their data type
                Output0Buffer.Col2 = int.Parse(cols[1]); // Example of converting to int
                Output0Buffer.Col3 = DateTime.Parse(cols[2]); // Example of converting to DateTime
                // ... rest of the columns
            }
            // Optional else to handle skipped lines
            // else
            //     write the line out somewhere
        }
        ctr++; // Increment the counter
    }
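The optional else branch mentioned in the comments can be sketched as follows. This is a minimal, standalone illustration under assumptions of my own: the class CsvPartitioner and its Partition method are hypothetical names, and rejected lines are simply collected in a list together with their 1-based line number so they can be logged or written to a file afterwards.

```csharp
using System;
using System.Collections.Generic;

public static class CsvPartitioner
{
    // Sorts each line into "accepted" (exactly expectedColumns fields) or
    // "rejected" (anything else). Rejected lines keep their 1-based line
    // number so they can be reported later.
    public static void Partition(
        IEnumerable<string> lines, char delimiter, int expectedColumns, bool hasHeader,
        List<string[]> accepted, List<KeyValuePair<int, string>> rejected)
    {
        int lineNumber = 0;
        foreach (string line in lines)
        {
            lineNumber++;
            if (hasHeader && lineNumber == 1)
            {
                continue; // skip the header row
            }

            string[] cols = line.Split(delimiter);
            if (cols.Length == expectedColumns)
            {
                accepted.Add(cols);
            }
            else
            {
                rejected.Add(new KeyValuePair<int, string>(lineNumber, line));
            }
        }
    }
}
```

Keeping the rejected lines rather than dropping them silently makes it easier to tell afterwards how many rows were skipped and why.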
Answer 2 (score: 1)
This is in response to @SidC's comment on my other answer.
This lets you process multiple files:
    // Set up variables
    string line;
    int ctr = 0;
    string[] files = System.IO.Directory.GetFiles(@"c:/path", "filenames*.txt");
    foreach (string file in files)
    {
        // The using block closes the reader once the file has been read
        using (var str = new System.IO.StreamReader(file))
        {
            while ((line = str.ReadLine()) != null)
            {
                // Work with line here, as in the other answer
            }
        }
    }
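Putting the file loop and the column check together might look like the sketch below. It is an illustration only: MultiFileReader and ReadValidRows are hypothetical names, the directory, search pattern, delimiter and column count are parameters, and header rows are not skipped here, so that check would still need to be added per file if the files have headers.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class MultiFileReader
{
    // Lazily yields the split fields of every line, across all files matching
    // the pattern, that has exactly expectedColumns fields.
    public static IEnumerable<string[]> ReadValidRows(
        string directory, string searchPattern, char delimiter, int expectedColumns)
    {
        foreach (string file in Directory.GetFiles(directory, searchPattern))
        {
            // File.ReadLines streams the file line by line instead of
            // loading it into memory all at once.
            foreach (string line in File.ReadLines(file))
            {
                string[] cols = line.Split(delimiter);
                if (cols.Length == expectedColumns)
                {
                    yield return cols;
                }
            }
        }
    }
}
```

In a script component, each yielded array would be written to Output0Buffer in the same way as in the single-file answer.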