Unable to detect bad data in a CSV bulk import using Microsoft.ACE.OLEDB.12.0

Time: 2016-05-24 21:33:40

Tags: c# asp.net csv import oledb

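For example, if a value in a date column has letters added to it, the value is treated as null and I get no warning.

I have gone through all of Microsoft's documentation and found no indication that this behavior can be changed. In all of Google I found only one article on the subject, and it says the behavior cannot be changed.

The schema.ini is created by code, but this is what it looks like.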

[NewEmployees.csv]
ColNameHeader=True
Format=CSVDelimited
DateTimeFormat=dd-MMM-yy
Col1=FirstName Text
Col2=LastName Text
Col3="Hire Date" Date

Here are the most relevant lines of code:

string strSql = "SELECT * FROM [" + FileUpload1.FileName + "]";
string strCSVConnString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + targetFolder + ";" + "Extended Properties='text;HDR=YES;'";
OleDbDataAdapter oleda = new OleDbDataAdapter(strSql, strCSVConnString);
DataTable importData = new DataTable();
oleda.Fill(importData);

GridView1.DataSource = importData;
GridView1.DataBind();

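If anyone wants the entire ASP.NET code, it is shown below. It lets the user choose a file on their computer, creates a folder named after the current date and time, creates a schema.ini and saves it to that folder, saves the uploaded CSV file to that folder, then queries the CSV file and binds the result to a GridView. It is nice code, but useless if it cannot detect bad data.

Code behind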

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;

using System.IO;
using System.Data;
using System.Data.OleDb;

using System.Data.SqlClient;

namespace WebApplication1
{
    public partial class EmployeeImport : System.Web.UI.Page
    {
        public string GetDateTimeStampedFolderName()
        {
            return string.Format("{0:yyyy-MM-dd_hh-mm-ss-tt}", DateTime.Now);
        }

        public void CreateSchemIni(string targetFolder, string fileName)
        {
            // The using block flushes, closes and disposes the writer and its file on exit,
            // so no explicit Close()/Dispose() calls are needed.
            using (StreamWriter writer = new StreamWriter(Path.Combine(targetFolder, "schema.ini")))
            {
                // The section name must match the CSV file name in the same folder.
                writer.WriteLine("[" + fileName + "]");
                writer.WriteLine("ColNameHeader=True");
                writer.WriteLine("Format=CSVDelimited");
                writer.WriteLine("DateTimeFormat=dd-MMM-yy");
                writer.WriteLine("Col1=FirstName Text");
                writer.WriteLine("Col2=LastName Text");
                writer.WriteLine("Col3=\"Hire Date\" Date");
            }
        }

        private void UploadAndImport()
        {
            if (FileUpload1.HasFile)
            {
                string targetFolder = Server.MapPath("~/Uploads/Employees/" + GetDateTimeStampedFolderName());

                if (System.IO.Directory.Exists(targetFolder) == false)
                {
                    System.IO.Directory.CreateDirectory(targetFolder);
                }

                FileUpload1.SaveAs(Path.Combine(targetFolder, FileUpload1.FileName));

                CreateSchemIni(targetFolder, FileUpload1.FileName);

                string strSql = "SELECT * FROM [" + FileUpload1.FileName + "]";
                string strCSVConnString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + targetFolder + ";" + "Extended Properties='text;HDR=YES;'";
                OleDbDataAdapter oleda = new OleDbDataAdapter(strSql, strCSVConnString);
                DataTable importData = new DataTable();
                oleda.Fill(importData);

                GridView1.DataSource = importData;
                GridView1.DataBind();
            }
        }

        protected void UploadButton_Click(object sender, EventArgs e)
        {
            if (FileUpload1.HasFile)
            {
                UploadAndImport();
            }
        }
    }
}

ASPX

<%@ Page Language="C#" AutoEventWireup="true" CodeBehind="EmployeeImport.aspx.cs" Inherits="WebApplication1.EmployeeImport" %>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title></title>
</head>
<body>
    <form id="form1" runat="server">
    <div>

         <asp:FileUpload ID="FileUpload1" runat="server" />

        <br />
        <asp:Button ID="UploadButton" runat="server" Text="Upload" 
            onclick="UploadButton_Click" />
        <asp:GridView ID="GridView1" runat="server">
        </asp:GridView>

    </div>
    </form>
</body>
</html>

2 answers:

Answer 0 (score: 0)

try
{
    oleda.Fill(importData);
}
catch(Exception) // put break point here
{
    throw;
}

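See whether you get any exception now.

Answer 1 (score: 0)

Importing data with Microsoft.ACE.OLEDB.12.0 and a schema.ini can cause two significant silent-but-deadly problems. I am posting solutions to both. One of them applies only to SQL Server, but a similar solution may work for other databases.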

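  1. Bad data, such as a date containing letters like "5/20/2016a", is converted to null. No exception or warning is raised when this happens, so it will happily corrupt your data.
  2. Column types in schema.ini are assigned by ordinal position, and the headers in the CSV are completely ignored. If the columns in the CSV are out of order, you will not get an exception or a warning, and your data will be corrupted.

For example, if schema.ini contains: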

    Col1=FirstName Text
    Col2=LastName Text
    Col3="Hire Date" Date
    

and the CSV has FirstName and LastName in the wrong order:

    LastName,FirstName,HireDate
    Smith,Jon,5/1/2016
    Moore,Larry,5/15/2016
    

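The ACE driver is not smart enough to recognize that the headers are out of order, so the data is imported incorrectly.

Solution to problem 1 - Bad data

The solution I came up with is to specify every column as a text field in schema.ini and to import the data into SQL Server with System.Data.SqlClient.SqlBulkCopy. When SqlBulkCopy encounters bad data, it is smart enough to throw an exception and prevent the entire CSV from being imported, even if only the last record is bad.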
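
Once every column arrives as text, the values can also be checked on the client before anything is sent to the server. The following is only a minimal sketch of that idea and is not part of the solution described above: the class name `DateColumnValidator`, the method `FindInvalidDates`, and the `"M/d/yyyy"` format used in the usage note are assumptions chosen to match the sample data in this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

public static class DateColumnValidator
{
    // Hypothetical helper: returns the zero-based indexes of rows whose value in
    // columnName does not parse exactly as a date in the given format.
    // Null (DBNull) and empty values are reported too; relax that if blanks are allowed.
    public static List<int> FindInvalidDates(DataTable table, string columnName, string format)
    {
        var badRows = new List<int>();
        for (int i = 0; i < table.Rows.Count; i++)
        {
            // "as string" yields null for DBNull.Value, which is then treated as invalid.
            string raw = table.Rows[i][columnName] as string;
            DateTime parsed;
            bool ok = raw != null && DateTime.TryParseExact(
                raw.Trim(), format, CultureInfo.InvariantCulture,
                DateTimeStyles.None, out parsed);
            if (!ok)
            {
                badRows.Add(i);
            }
        }
        return badRows;
    }
}
```

Assuming the text-only schema.ini, a call such as `DateColumnValidator.FindInvalidDates(importData, "Hire Date", "M/d/yyyy")` would return the indexes of rows like the one containing "5/20/2016a", which could then be reported to the user instead of a single exception message.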

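Solution to problem 2 - CSV columns out of order, or missing/extra columns

To solve this I created two DataTables, one of which is filled with the schema only and no data. The schema-only table must be filled before schema.ini is created, because once schema.ini exists the headers in the CSV are ignored.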

    DataTable importData = new DataTable();
    DataTable importDataSourceSchema = new DataTable();
    
    // Fill the schema prior to creating the schema.ini, as this is the only way to get the headers from the CSV
    oleda.FillSchema(importDataSourceSchema, System.Data.SchemaType.Source);
    CreateSchemIni(targetFolder, FileUpload1.FileName);
    oleda.Fill(importData);
    

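I then created a function that verifies that the headers in the CSV are in the correct order and that the CSV contains the correct number of columns: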

    private bool ValidateHeaders(DataTable importData, DataTable importDataSourceSchema)
    {
        bool isValid = true;
    
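        if (importData.Columns.Count != importDataSourceSchema.Columns.Count)
        {
            ValidationLabel.Text = ValidationLabel.Text + "<br />Wrong number of columns";
            // Stop here: the loop below indexes both tables by position and would
            // throw if importDataSourceSchema has fewer columns than importData.
            return false;
        }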
    
        for (int i = 0; i < importData.Columns.Count; i++)
        {
            if (importData.Columns[i].ColumnName != importDataSourceSchema.Columns[i].ColumnName)
            {
                ValidationLabel.Text = ValidationLabel.Text + "<br />Error finding column " + importData.Columns[i].ColumnName;
                isValid = false;
            }
        }
        return isValid;
    }
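
As an alternative to comparing two DataTables, the header row can be read straight from the file, which does not depend on whether schema.ini exists yet. This is a hedged sketch rather than the approach used in this answer: `CsvHeaderCheck` and `HeaderMatches` are hypothetical names, and the code assumes a simple comma-separated header with no commas inside quoted names.

```csharp
using System;
using System.IO;
using System.Linq;

public static class CsvHeaderCheck
{
    // Hypothetical helper: reads only the first line of the CSV and compares it,
    // in order, against the expected column names (case-insensitive, trimmed,
    // surrounding double quotes removed).
    public static bool HeaderMatches(string csvPath, string[] expected)
    {
        string firstLine = File.ReadLines(csvPath).FirstOrDefault();
        if (firstLine == null)
        {
            return false; // empty file, so there is no header to compare
        }

        string[] actual = firstLine.Split(',')
            .Select(h => h.Trim().Trim('"'))
            .ToArray();

        // SequenceEqual is false when the lengths differ or any position differs.
        return actual.SequenceEqual(expected, StringComparer.OrdinalIgnoreCase);
    }
}
```

For the out-of-order sample file shown earlier, `HeaderMatches(path, new[] { "FirstName", "LastName", "Hire Date" })` returns false, so the import can be rejected before the ACE driver is involved.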
    

I then call ValidateHeaders before performing the bulk import:

    if (ValidateHeaders(importData, importDataSourceSchema))
    {
        using (SqlBulkCopy bulkCopy = new SqlBulkCopy([Add your ConnectionString here]))
        {
            bulkCopy.DestinationTableName = "dbo.EmployeeImport";
            bulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("FirstName", "FirstName"));
            bulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("LastName", "LastName"));
            bulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("Hire Date", "HireDate"));
            try
            {
                bulkCopy.WriteToServer(importData);
                ValidationLabel.Text = "Success";
                GridView1.DataSource = importData;
                GridView1.DataBind();
            }
            catch (Exception e)
            {
                ValidationLabel.Text = e.Message;
            }
        }
    }
    

Below is the complete proof-of-concept code, written for ASP.NET WebForms.

ASPX

    <%@ Page Language="C#" AutoEventWireup="true" CodeBehind="EmployeeImport.aspx.cs" Inherits="WebApplication1.EmployeeImport" %>
    
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head runat="server">
        <title></title>
    </head>
    <body>
        <form id="form1" runat="server">
        <div>
    
             <asp:FileUpload ID="FileUpload1" runat="server" />
    
            <br />
            <asp:Button ID="UploadButton" runat="server" Text="Upload" 
                onclick="UploadButton_Click" />
    
            <br />
            Data Imported: <asp:Label ID="ValidationLabel" runat="server" ForeColor="Red"></asp:Label>
            <asp:GridView ID="GridView1" runat="server">
            </asp:GridView>
    
        </div>
        </form>
    </body>
    </html>
    

Code behind

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Web;
    using System.Web.UI;
    using System.Web.UI.WebControls;
    
    using System.IO;
    using System.Data;
    using System.Data.OleDb;
    
    using System.Data.SqlClient;
    
    namespace WebApplication1
    {
        public partial class EmployeeImport : System.Web.UI.Page
        {
            public string GetDateTimeStampedFolderName()
            {
                return string.Format("{0:yyyy-MM-dd_hh-mm-ss-tt}", DateTime.Now);
            }
    
            public void CreateSchemIni(string targetFolder, string fileName)
            {
                // The using block flushes, closes and disposes the writer and its file on exit,
                // so no explicit Close()/Dispose() calls are needed.
                using (StreamWriter writer = new StreamWriter(Path.Combine(targetFolder, "schema.ini")))
                {
                    // The section name must match the CSV file name in the same folder.
                    writer.WriteLine("[" + fileName + "]");
                    writer.WriteLine("ColNameHeader=True");
                    writer.WriteLine("Format=CSVDelimited");
                    // Every column is declared as Text so that bad values reach SqlBulkCopy unchanged.
                    writer.WriteLine("Col1=FirstName Text");
                    writer.WriteLine("Col2=LastName Text");
                    writer.WriteLine("Col3=\"Hire Date\" Text");
                }
            }
    
            private bool ValidateHeaders(DataTable importData, DataTable importDataSourceSchema)
            {
    
                bool isValid = true;
    
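                if (importData.Columns.Count != importDataSourceSchema.Columns.Count)
                {
                    ValidationLabel.Text = ValidationLabel.Text + "<br />Wrong number of columns";
                    // Stop here: the loop below indexes both tables by position and would
                    // throw if importDataSourceSchema has fewer columns than importData.
                    return false;
                }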
    
                for (int i = 0; i < importData.Columns.Count; i++)
                {
                    if (importData.Columns[i].ColumnName != importDataSourceSchema.Columns[i].ColumnName)
                    {
                        ValidationLabel.Text = ValidationLabel.Text + "<br />Error finding column " + importData.Columns[i].ColumnName;
                        isValid = false;
                    }
                }
    
                return isValid;
            }
    
            private void UploadAndImport()
            {
                if (FileUpload1.HasFile)
                {
                    string targetFolder = Server.MapPath("~/Uploads/Employees/" + GetDateTimeStampedFolderName());
    
                    if (System.IO.Directory.Exists(targetFolder) == false)
                    {
                        System.IO.Directory.CreateDirectory(targetFolder);
                    }
    
                    FileUpload1.SaveAs(Path.Combine(targetFolder, FileUpload1.FileName));
    
    
    
                    string strSql = "SELECT * FROM [" + FileUpload1.FileName + "]";
                    string strCSVConnString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + targetFolder + ";" + "Extended Properties='text;HDR=YES;'";
                    using (OleDbDataAdapter oleda = new OleDbDataAdapter(strSql, strCSVConnString))
                    {
                        DataTable importData = new DataTable();
                        DataTable importDataSourceSchema = new DataTable();
    
                        // Fill the schema prior to creating the schema.ini, as this is the only way to get the headers from the CSV
                        oleda.FillSchema(importDataSourceSchema, System.Data.SchemaType.Source);
                        CreateSchemIni(targetFolder, FileUpload1.FileName);
                        oleda.Fill(importData);
    
                        if (ValidateHeaders(importData, importDataSourceSchema))
                        {
                            using (SqlBulkCopy bulkCopy = new SqlBulkCopy([Add your ConnectionString here], SqlBulkCopyOptions.TableLock))
                            {
                                bulkCopy.DestinationTableName = "dbo.EmployeeImport";
                                bulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("FirstName", "FirstName"));
                                bulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("LastName", "LastName"));
                                bulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping("Hire Date", "HireDate"));
                                try
                                {
                                    bulkCopy.WriteToServer(importData);
                                    ValidationLabel.Text = "Success";
                                    GridView1.DataSource = importData;
                                    GridView1.DataBind();
                                }
                                catch (Exception e)
                                {
                                    ValidationLabel.Text = e.Message;
                                }
                            }
    
    
                        }
                    }
                }
            }
    
            protected void UploadButton_Click(object sender, EventArgs e)
            {
                if (FileUpload1.HasFile)
                {
                    ValidationLabel.Text = "";
                    UploadAndImport();
                }
            }
        }
    }