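I receive an Excel file every day, and the number of columns in it is not fixed. I need to load only the last column of the sheet into my table through SSIS. How can I identify the last used column dynamically?
Answer 0 (score: 3)
You can use a C# script component:
Make sure to add using System.Data.OleDb; to the namespace section, and add an output column named LastCol with a suitable data type.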
public override void CreateNewOutputRows()
{
    // Requires "using System.Data;" and "using System.Data.OleDb;" in the namespace section.
    string fileName = @"C:\test.xlsx";
    string sheetName = "Sheet1$"; // worksheet names need a trailing $ in OLE DB queries
    string cstr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 12.0;HDR=No;IMEX=1\"";
    using (OleDbConnection xlConn = new OleDbConnection(cstr))
    {
        xlConn.Open();
        using (OleDbCommand xlCmd = xlConn.CreateCommand())
        {
            xlCmd.CommandText = "Select * from [" + sheetName + "]";
            xlCmd.CommandType = CommandType.Text;
            using (OleDbDataReader rdr = xlCmd.ExecuteReader())
            {
                int rowCt = 0; // row counter
                while (rdr.Read())
                {
                    // HDR=No returns the header row as data, so skip the first row
                    if (rowCt != 0)
                    {
                        int lastCol = rdr.FieldCount - 1; // column ordinals are zero-based
                        Output0Buffer.AddRow();
                        // Convert instead of casting: Excel numbers usually arrive as double,
                        // and with IMEX=1 a mixed column may arrive as text
                        Output0Buffer.LastCol = Convert.ToInt32(rdr[lastCol]);
                    }
                    rowCt++; // increment counter
                }
            }
        }
    }
}
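The key detail in the script above is that FieldCount is a count, while column ordinals start at zero, so the last column sits at FieldCount - 1. The following is a minimal sketch of that idea that runs without an Excel file: it reads from an in-memory DataTable instead of an OleDbDataReader. The class name LastColumnDemo, the method ReadLastColumn, and the sample data are all hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class LastColumnDemo
{
    // Reads the value of the last column from every row of the reader.
    // When skipFirstRow is true, the first row (a header row) is ignored.
    public static List<object> ReadLastColumn(IDataReader rdr, bool skipFirstRow)
    {
        var values = new List<object>();
        bool isFirstRow = true;
        while (rdr.Read())
        {
            if (isFirstRow && skipFirstRow)
            {
                isFirstRow = false;
                continue;
            }
            isFirstRow = false;
            // FieldCount is the number of columns; ordinals are zero-based
            values.Add(rdr[rdr.FieldCount - 1]);
        }
        return values;
    }

    public static void Main()
    {
        // Stand-in for a sheet read with HDR=No: the header is the first data row
        var table = new DataTable();
        table.Columns.Add("F1", typeof(string));
        table.Columns.Add("F2", typeof(string));
        table.Columns.Add("F3", typeof(string));
        table.Rows.Add("Id", "Name", "Amount");
        table.Rows.Add("1", "a", "10");
        table.Rows.Add("2", "b", "20");

        using (IDataReader rdr = table.CreateDataReader())
        {
            foreach (object value in ReadLastColumn(rdr, true))
            {
                Console.WriteLine(value); // prints 10, then 20
            }
        }
    }
}
```

Because ReadLastColumn only depends on IDataReader, the same loop should also work with the OleDbDataReader returned by ExecuteReader in the script component.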
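Answer 1 (score: 2)
Using a Script Task:
This answer assumes that the sheet name is Sheet1 and that the programming language used is VB.Net.
Add a Script Task and select @[User::strQuery] as a ReadWrite variable and @[User::ExcelFilePath] as a ReadOnly variable (in the Script Task window).
The script finds the index of the last column, converts that index to a column letter with the function GetExcelColumnName shown after the script (for example: 1 -> A), and builds a SQL command that reads only the last column.
Note: you must import System.Data.OleDb
Public Sub Main()
    m_strExcelPath = Dts.Variables.Item("ExcelFilePath").Value.ToString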
    Dim strSheetname As String = String.Empty
    Dim intLastColumn As Integer = 0
    m_strExcelConnectionString = Me.BuildConnectionString()
    Try
        Using OleDBCon As New OleDbConnection(m_strExcelConnectionString)
            If OleDBCon.State <> ConnectionState.Open Then
                OleDBCon.Open()
            End If
            'Get all worksheets
            m_dtschemaTable = OleDBCon.GetOleDbSchemaTable(OleDbSchemaGuid.Tables,
                                                           New Object() {Nothing, Nothing, Nothing, "TABLE"})
            'Loop over the worksheets to get the first real one (the workbook may contain temporary or deleted sheets)
            For Each schRow As DataRow In m_dtschemaTable.Rows
                strSheetname = schRow("TABLE_NAME").ToString
                If Not strSheetname.EndsWith("_") AndAlso strSheetname.EndsWith("$") Then
                    Using cmd As New OleDbCommand("SELECT * FROM [" & strSheetname & "]", OleDBCon)
                        Dim dtTable As New DataTable("Table1")
                        cmd.CommandType = CommandType.Text
                        Using daGetDataFromSheet As New OleDbDataAdapter(cmd)
                            daGetDataFromSheet.Fill(dtTable)
                        End Using
                        'Get the last column index (1-based, equal to the column count)
                        intLastColumn = dtTable.Columns.Count
                    End Using
                    'Once the first valid sheet is found there is no need to check the others
                    Exit For
                End If
            Next
            OleDBCon.Close()
        End Using
    Catch ex As Exception
        Throw New Exception(ex.Message, ex)
    End Try
    Dim strColumnname As String = GetExcelColumnName(intLastColumn)
    'strSheetname already ends with $, so this yields a range such as [Sheet1$C:C]
    Dts.Variables.Item("strQuery").Value = "SELECT * FROM [" & strSheetname & strColumnname & ":" & strColumnname & "]"
    Dts.TaskResult = ScriptResults.Success
End Sub
Private Function GetExcelColumnName(columnNumber As Integer) As String
    Dim dividend As Integer = columnNumber
    Dim columnName As String = String.Empty
    Dim modulo As Integer
    While dividend > 0
        'Column letters are a bijective base-26 system, hence the -1
        modulo = (dividend - 1) Mod 26
        columnName = Convert.ToChar(65 + modulo).ToString() & columnName
        'Integer division; gives the same result as CInt(... / 26) here
        dividend = (dividend - modulo) \ 26
    End While
    Return columnName
End Function
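For readers following the C# answer, the same index-to-letter conversion and the last-column query can be sketched in C#. This is an illustrative port, not part of the original answer: the class name ExcelColumns and the helper BuildLastColumnQuery are hypothetical, and the sheet name passed in is assumed to already carry its trailing $ (as the TABLE_NAME values from the schema table do).

```csharp
using System;

public static class ExcelColumns
{
    // Converts a 1-based column index to Excel letters (1 -> A, 27 -> AA).
    public static string GetExcelColumnName(int columnNumber)
    {
        string columnName = string.Empty;
        int dividend = columnNumber;
        while (dividend > 0)
        {
            // Letters form a bijective base-26 system, hence the -1
            int modulo = (dividend - 1) % 26;
            columnName = (char)('A' + modulo) + columnName;
            dividend = (dividend - modulo) / 26; // integer division
        }
        return columnName;
    }

    // Builds a query over a single-column range, e.g. [Sheet1$C:C].
    // sheetName is assumed to include the trailing $.
    public static string BuildLastColumnQuery(string sheetName, int columnCount)
    {
        string column = GetExcelColumnName(columnCount);
        return "SELECT * FROM [" + sheetName + column + ":" + column + "]";
    }

    public static void Main()
    {
        Console.WriteLine(GetExcelColumnName(1));   // A
        Console.WriteLine(GetExcelColumnName(26));  // Z
        Console.WriteLine(GetExcelColumnName(27));  // AA
        Console.WriteLine(GetExcelColumnName(703)); // AAA
        Console.WriteLine(BuildLastColumnQuery("Sheet1$", 3)); // SELECT * FROM [Sheet1$C:C]
    }
}
```

The column count taken from the DataTable is 1-based, which is why it can be passed to the conversion directly without adjusting by one.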
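The script assigns the resulting query to the variable @[User::strQuery].
In the Data Flow Task, add an Excel Source, set its data access mode to SQL command from variable, and choose @[User::strQuery]. Give the variable a default value of Select * from [Sheet1$] so that it holds a valid query at design time.
Set the Delay Validation property of the Data Flow Task to True.
Answer 2 (score: 0)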
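No, you cannot do this. The number of columns and their data types must be determined in advance and cannot change; otherwise SSIS will fail. So there is no way to get the last column dynamically. A workaround is to use a macro to extract the last column from the Excel file, and then use that output as the source for SSIS.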