CSV parser to parse double quotes via OLEDB

Time: 2011-10-21 19:51:05

Tags: c# parsing csv datatable

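How can I use OLEDB to parse and import a CSV file in which every cell is enclosed in double quotes, because some rows contain commas? I cannot change the format, as it comes from a vendor.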

I am trying the following, and it fails with an IO error:

public DataTable ConvertToDataTable(string fileToImport, string fileDestination)
{
    string fullImportPath = fileDestination + @"\" + fileToImport;
    OleDbDataAdapter dAdapter = null;
    DataTable dTable = null;

    try
    {
        if (!File.Exists(fullImportPath))
            return null;

        string full = Path.GetFullPath(fullImportPath);
        string file = Path.GetFileName(full);
        string dir = Path.GetDirectoryName(full);


        //create the "database" connection string
        string connString = "Provider=Microsoft.Jet.OLEDB.4.0;"
          + "Data Source=\"" + dir + "\\\";"
          + "Extended Properties=\"text;HDR=No;FMT=Delimited\"";

        //create the database query
        string query = "SELECT * FROM " + file;

        //create a DataTable to hold the query results
        dTable = new DataTable();

        //create an OleDbDataAdapter to execute the query
        dAdapter = new OleDbDataAdapter(query, connString);


        //fill the DataTable
        dAdapter.Fill(dTable);
    }
    catch (Exception ex)
    {
        throw new Exception(CLASS_NAME + ".ConvertToDataTable: Caught Exception: " + ex);
    }
    finally
    {
        if (dAdapter != null)
            dAdapter.Dispose();
    }

    return dTable;
}

It works fine when I use a plain CSV. Do I need to change something in connString?

8 answers:

Answer 0 (score: 3)

Use a dedicated CSV parser.

There are many out there. A popular one is FileHelpers, but there is also one tucked away in the Microsoft.VisualBasic.FileIO namespace: TextFieldParser.
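As a rough illustration of the TextFieldParser suggestion, here is a minimal sketch that reads quoted, comma-delimited records into a list of string arrays. The class and method names (QuotedCsvReader, ReadRows) are made up for this example. It assumes the project can reference Microsoft.VisualBasic (an explicit assembly reference on .NET Framework).

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper for illustration; not part of any library.
public static class QuotedCsvReader
{
    // Reads every record from the reader; each record becomes one string[].
    public static List<string[]> ReadRows(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // Treat "..." as one field, so commas inside quotes are kept.
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

To read from disk, pass a StreamReader over the vendor file, for example `QuotedCsvReader.ReadRows(new StreamReader(path))`. The surrounding quotes are removed from each field, and the parser skips blank lines.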

Answer 1 (score: 1)

Take a look at FileHelpers.

Answer 2 (score: 1)

You can use this code (MS Office required):

  private void ConvertCSVtoExcel(string filePath = @"E:\nucc_taxonomy_140.csv", string tableName = "TempTaxonomyCodes")
    {
        string tempPath = System.IO.Path.GetDirectoryName(filePath);
        string strConn = @"Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + tempPath + @"\;Extensions=asc,csv,tab,txt";
        OdbcConnection conn = new OdbcConnection(strConn);
        OdbcDataAdapter da = new OdbcDataAdapter("Select * from " + System.IO.Path.GetFileName(filePath), conn);
        DataTable dt = new DataTable();
        da.Fill(dt);

        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(ConfigurationSettings.AppSettings["dbConnectionString"]))
        {
            bulkCopy.DestinationTableName = tableName;
            bulkCopy.BatchSize = 50;
            bulkCopy.WriteToServer(dt);
        }

    }

Answer 3 (score: 1)

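There are a lot of things to consider when handling CSV files. However you extract them from the file, you should know how the parsing is being handled. There are classes out there that can get you part of the way, but most of them do not handle the nuances that Excel does with embedded commas, quotes, and line breaks. On the other hand, loading Excel or the MS classes seems like a lot of overhead if you just want to parse a text file such as a CSV.

One thing you can consider is doing the parsing with your own Regex, which also makes your code more platform independent in case you need to port it to another server or application at some point. Regular expressions have the added benefit of being available in virtually every language. That said, there are some good regex patterns out there that handle the CSV puzzle. Here is my shot at it, which covers embedded commas, quotes, and line breaks. Regex code/pattern and explanation: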

http://www.kimgentes.com/worshiptech-web-tools-page/2008/10/14/regex-pattern-for-parsing-csv-files-with-embedded-commas-dou.html

Hope that helps.

Answer 4 (score: 0)

Try the code from my answer:

Reading CSV files in C#

It handles quoted CSV just fine.

Answer 5 (score: 0)

 private static string[] Mubashir_CSVParser(string s)
        {
            // extract the fields: split on commas that are followed by an
            // even number of quotes, i.e. commas outside quoted text
            Regex RegexCSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
            String[] Fields = RegexCSVParser.Split(s);

            // clean up the fields (remove " and leading spaces)
            for (int i = 0; i < Fields.Length; i++)
            {
                Fields[i] = Fields[i].TrimStart(' ', '"');
                Fields[i] = Fields[i].TrimEnd('"');// this line removes the quotes
                //Fields[i] = Fields[i].Trim();
            }

            // return the parsed fields so the caller can use them
            return Fields;
        }
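For comparison, here is a self-contained variant of the same regex-split idea. It is a sketch rather than this answer's exact method. The names RegexCsvLine and Split are invented for the example. It uses the common "even number of quotes up to end of line" lookahead, strips one pair of enclosing quotes per field, and turns doubled quotes into single ones. It handles one line at a time, so quoted fields that span line breaks are out of scope.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; handles a single CSV line.
public static class RegexCsvLine
{
    // Matches a comma only when an even number of quotes follows it,
    // which means the comma sits outside any quoted field.
    private static readonly Regex Separator =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    public static string[] Split(string line)
    {
        string[] fields = Separator.Split(line);
        for (int i = 0; i < fields.Length; i++)
        {
            string f = fields[i].Trim();
            // Remove exactly one pair of enclosing quotes, then unescape "".
            if (f.Length >= 2 && f[0] == '"' && f[f.Length - 1] == '"')
            {
                f = f.Substring(1, f.Length - 2).Replace("\"\"", "\"");
            }
            fields[i] = f;
        }
        return fields;
    }
}
```

Because the lookahead rescans the rest of the line for every comma, this approach may get slow on very long lines. A dedicated parser is likely the safer choice for large vendor files.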

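Answer 6 (score: -1)

Just in case anyone has a similar problem, I wanted to post the code I used. I ended up using TextFieldParser to get the file and parse out the columns, but I used recursion and substrings to do the rest of the work.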

 /// <summary>
        /// Parses each string passed as a "row".
        /// This routine accounts for both double quotes
        /// as well as commas currently, but can be added to
        /// </summary>
        /// <param name="row"> string or row to be parsed</param>
        /// <returns></returns>
        private List<String> ParseRowToList(String row)
        {
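            List<String> returnValue = new List<String>();

            if (row.Length == 0)
            {// Empty column (e.g. after a trailing comma); avoids indexing row[0]
                returnValue.Add(String.Empty);
                return returnValue;
            }

            if (row[0] == '\"')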
            {// Quoted String
                if (row.IndexOf("\",") > -1)
                {// There are more columns
                    returnValue = ParseRowToList(row.Substring(row.IndexOf("\",") + 2));
                    returnValue.Insert(0, row.Substring(1, row.IndexOf("\",") - 1));
                }
                else
                {// This is the last column
                    returnValue.Add(row.Substring(1, row.Length - 2));
                }
            }
            else
            {// Unquoted String
                if (row.IndexOf(",") > -1)
                {// There are more columns
                    returnValue = ParseRowToList(row.Substring(row.IndexOf(",") + 1));
                    returnValue.Insert(0, row.Substring(0, row.IndexOf(",")));
                }
                else
                {// This is the last column
                    returnValue.Add(row.Substring(0, row.Length));
                }
            }

            return returnValue;

        }

The code for the TextFieldParser is then:

 // string pathFile = @"C:\TestFTP\TestCatalog.txt";
            string pathFile = @"C:\TestFTP\SomeFile.csv";

            List<String> stringList = new List<String>();
            TextFieldParser fieldParser = null;
            DataTable dtable = new DataTable();

            /* Set up TextFieldParser
                *  use the correct delimiter provided
                *  and path */
            fieldParser = new TextFieldParser(pathFile);
            /* Set that there are quotes in the file for fields and or column names */
            fieldParser.HasFieldsEnclosedInQuotes = true;

            /* delimiter by default to be used first */
            fieldParser.SetDelimiters(new string[] { "," });

            // Build Full table to be imported
            dtable = BuildDataTable(fieldParser, dtable);
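The answer does not show BuildDataTable, so the following is only a hypothetical stand-in with the same call shape. The class name CsvTableBuilder is invented. It assumes the first record holds unique column names and that every value is stored as a string. The real helper may well behave differently.

```csharp
using System.Data;
using Microsoft.VisualBasic.FileIO;

// Hypothetical stand-in; the original BuildDataTable is not shown in the answer.
public static class CsvTableBuilder
{
    public static DataTable BuildDataTable(TextFieldParser fieldParser, DataTable dtable)
    {
        if (fieldParser.EndOfData)
            return dtable;

        // Assumption: the first record is a header row with unique names.
        foreach (string name in fieldParser.ReadFields())
            dtable.Columns.Add(name, typeof(string));

        while (!fieldParser.EndOfData)
        {
            string[] fields = fieldParser.ReadFields();

            // If a record is wider than the header, add generic columns
            // instead of silently dropping the extra values.
            while (dtable.Columns.Count < fields.Length)
                dtable.Columns.Add("Column" + (dtable.Columns.Count + 1), typeof(string));

            DataRow row = dtable.NewRow();
            for (int i = 0; i < fields.Length; i++)
                row[i] = fields[i];
            dtable.Rows.Add(row);
        }
        return dtable;
    }
}
```

With HasFieldsEnclosedInQuotes set as in the snippet above, a value such as "Smith, John" arrives in a single cell with the quotes already removed.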

Answer 7 (score: -1)

This is what I use in my project to parse a single line of data.

    private string[] csvParser(string csv, char separator = ',')
    {
        List <string> parsed = new List<string>();
        string[] temp = csv.Split(separator);
        int counter = 0;
        string data = string.Empty;
        while (counter < temp.Length)
        {
            data = temp[counter].Trim();
            // A field that opens with a quote but does not close it was split
            // on an embedded separator; re-join pieces until the closing quote.
            if (data.StartsWith("\"") && !(data.Length > 1 && data.EndsWith("\"")))
            {
                while (counter + 1 < temp.Length)
                {
                    counter++;
                    data += separator.ToString() + temp[counter];
                    if (temp[counter].Trim().EndsWith("\""))
                        break;
                }
            }
            parsed.Add(data);
            counter++;
        }

        return parsed.ToArray();

    }

http://zamirsblog.blogspot.com/2013/09/c-csv-parser-csvparser.html