I am trying to import a CSV file into SQL Server using SSIS.
Here is an example of how the data looks:
Student_Name,Student_DOB,Student_ID,Student_Notes,Student_Gender,Student_Mother_Name
Joseph Jade,2005-01-01,1,Good listener,Male,Amy
Amy Jade,2006-01-01,1,Good in science,Female,Amy
....
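The CSV columns do not use a text qualifier (quotation marks).
I created a simple SSIS package to import the file into SQL Server, but sometimes the data in SQL looks like this: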
Student_Name Student_DOB Student_ID Student_Notes Student_Gender Student_Mother_Name
Ali Jade 2004-01-01 1 Good listener Bad in science Male,Lisa
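The reason is that the [Student_Notes] column sometimes contains a comma (,), which is also the column delimiter, so those rows are not imported correctly.
Any suggestions?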
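Answer 0 (score: 1)
Warning: I am not a regular C# coder.
But anyway, this code does the following:
It opens a file called C:\Input.TXT
It searches each line. If the line has more than 5 commas, it takes all the extra commas out of the third-from-last field (the notes)
It writes the result to C:\Output.TXT - that is the file you actually import
Many improvements could be made.
Keep in mind that your package needs write access to the appropriate folder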
public void Main()
{
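// Requires "using System.IO;" at the top of the script (File, StreamReader, StreamWriter)
// Search the file and remove extra commas from the third-from-last field (notes)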
// Extended from code at
// http://stackoverflow.com/questions/1915632/open-a-file-and-replace-strings-in-c-sharp
// Nick McDermaid
string sInputLine;
string sOutputLine;
string sDelimiter = ",";
String[] sData;
int iIndex;
// open the file for read
using (System.IO.FileStream inputStream = File.OpenRead("C:\\Input.txt"))
{
using (StreamReader inputReader = new StreamReader(inputStream))
{
// open the output file (CreateText overwrites it, so rerunning the package does not duplicate rows)
using (StreamWriter outputWriter = File.CreateText("C:\\Output.txt"))
{
// Read each line
while (null != (sInputLine = inputReader.ReadLine()))
{
// Grab each field out
sData = sInputLine.Split(sDelimiter[0]);
if (sData.Length <= 6)
{
// 6 or less fields - just echo it out
sOutputLine = sInputLine;
}
else
{
// line has more than 6 pieces
// We assume all of the extra commas are in the notes field
// Put the first three fields together
sOutputLine =
sData[0] + sDelimiter +
sData[1] + sDelimiter +
sData[2] + sDelimiter;
// Put the middle notes fields together, excluding the delimiter
for (iIndex=3; iIndex <= sData.Length - 3; iIndex++)
{
sOutputLine = sOutputLine + sData[iIndex] + " ";
}
// Tack on the last two fields
sOutputLine = sOutputLine +
sDelimiter + sData[sData.Length - 2] +
sDelimiter + sData[sData.Length - 1];
}
// We've built the corrected line; now write it out
outputWriter.WriteLine(sOutputLine);
}
}
}
}
Dts.TaskResult = (int)Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success;
}
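The comma-merging step in the code above can also be pulled out into a small pure function, which is easier to test outside SSIS. This is only a minimal sketch, assuming six expected columns with the notes in the fourth position; the class name `CsvRepair` and the method `RepairLine` are hypothetical and not part of the answer's code. Unlike the loop above, it joins the note fragments with `string.Join`, so no trailing space is left before the next delimiter.

```csharp
public static class CsvRepair
{
    // Hypothetical helper: merges surplus comma-separated pieces back into
    // the notes field, assuming exactly 6 columns are expected.
    public static string RepairLine(string line)
    {
        const int ExpectedFields = 6;
        const int NotesIndex = 3;      // Student_Notes is the fourth column
        const int TrailingFields = 2;  // Student_Gender, Student_Mother_Name

        string[] parts = line.Split(',');
        if (parts.Length <= ExpectedFields)
        {
            return line; // 6 or fewer fields: nothing to repair
        }

        // Every piece between the first three and the last two fields is notes text
        int notesCount = parts.Length - NotesIndex - TrailingFields;
        string head = string.Join(",", parts, 0, NotesIndex);
        string notes = string.Join(" ", parts, NotesIndex, notesCount);
        string tail = string.Join(",", parts, parts.Length - TrailingFields, TrailingFields);
        return head + "," + notes + "," + tail;
    }
}
```

For example, `CsvRepair.RepairLine("Ali Jade,2004-01-01,1,Good listener,Bad in science,Male,Lisa")` returns `"Ali Jade,2004-01-01,1,Good listener Bad in science,Male,Lisa"`. Inside the Script Task, the `while` loop would then only need to call this method and write the result.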
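Answer 1 (score: 1)
In the Flat File Connection Manager, define the file as a single column (DT_STR 8000).
Then add a Script Component to the Data Flow Task and add output columns (the same as in the example shown).
In the Script Component, split each row using the following code: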
' Student_Name,Student_DOB,Student_ID,Student_Notes,Student_Gender,Student_Mother_Name
Dim strCells() as string = Row.Column0.Split(CChar(","))
Row.StudentName = strCells(0)
Row.StudentDOB = strCells(1)
Row.StudentID = strCells(2)
Row.StudentMother = strCells(strCells.Length - 1)
Row.StudentGender = strCells(strCells.Length - 2)
Dim strNotes as String = String.Empty
For I As Integer = 3 To strCells.Length - 3
strNotes &= strCells(I)
Next
Row.StudentNotes = strNotes
It works fine for me.
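The same split can be sketched in C# for anyone whose Script Component uses that language. This is a hedged illustration rather than the answer's method: the class `StudentLineParser` and the method `SplitStudentLine` are hypothetical names, and it assumes that any surplus commas belong to the notes field. It also differs from the code above in one respect: because each field ends up in its own output column, the commas inside the notes can be restored instead of dropped.

```csharp
using System;

public static class StudentLineParser
{
    // Hypothetical helper: splits one raw line into exactly six fields,
    // assuming any surplus commas belong to the notes field (index 3).
    public static string[] SplitStudentLine(string line)
    {
        string[] cells = line.Split(',');
        if (cells.Length < 6)
        {
            throw new FormatException("Expected at least 6 fields, got " + cells.Length);
        }

        // Everything except the 3 leading and 2 trailing fields is notes text
        int notesCount = cells.Length - 5;
        return new[]
        {
            cells[0],                                // Student_Name
            cells[1],                                // Student_DOB
            cells[2],                                // Student_ID
            string.Join(",", cells, 3, notesCount),  // Student_Notes, commas restored
            cells[cells.Length - 2],                 // Student_Gender
            cells[cells.Length - 1]                  // Student_Mother_Name
        };
    }
}
```

In a Script Component, the six returned values would then be assigned to the output columns in the same order as the assignments above.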
Answer 2 (score: 0)
If importing the CSV file is not a routine task