How to read a CSV file by splitting on commas, except when the comma is part of a field

Asked: 2014-06-06 19:07:31

Tags: c# sql sql-server csv

I have the following C# code that reads a CSV file, with the goal of saving it into a SQL table:

StreamReader sr = new StreamReader(tbCSVFileLocation.Text.ToString());
string line = sr.ReadLine();
string[] value = line.Split(',');
DataTable dt = new DataTable();
DataRow row;

foreach (string dc in value)
{
  dt.Columns.Add(new DataColumn(dc));
}

while (!sr.EndOfStream)
{
  value = sr.ReadLine().Split(',');
  if (value.Length == dt.Columns.Count)
  {
    row = dt.NewRow();
    row.ItemArray = value;
    dt.Rows.Add(row);
  }
}

The problem I'm having is that I can't tell why the data ends up in my table the way it does.

Here is a sample of the CSV file:

Name,Address,License Number,License Type,Year of Birth,Effective Date,Action,Misconduct Description,Date Updated
"563 Grand Medical, P.C.","563 Grand Street Brooklyn, NY 11211",196275,,,09/29/2010,Revocation of certificate of incorporation.,"The corporation admitted guilt to the charge of ordering excessive tests, treatment, or use of treatment facilities not warranted by the condition of a patient.",09/29/2010
"Aaron, Joseph","2803 North 700 East Provo, Utah 84604",072800,MD,1927,01/13/1999,License Surrender,"This action modifies the penalty previously imposed by Order# 93-40 on March 31, 1993, where the Hearing Committee sustained the charge that the physician was disciplined by the Utah State Medical Board, and ordered that if he intends to engage in practice in NY State, a two-year period of probation shall be imposed.",
"Aarons, Mark Gold","P.O.Box 845 Southern Pines, North Carolina 28388",161530,MD,1958,12/13/2005,"License limited until the physician's North Carolina medical license is fully restored without any conditions.The physician must also comply with the terms imposed on July 26, 2005 by the North Carolina State Medical Board. The physician has completed the monitoring terms.",The physician did not contest the charge of having been disciplined by the North Carolina State Medical Board for his addiction to drugs.,12/06/2005

When I look at my SQL table, this is what it shows:

Name    Address License Number  License Type    Year of Birth   Effective Date  Action  Misconduct Description  Date Updated                    
Orlando  FL 32836"  173309  MD  1938    2/29/2012   License surrender   The physician did not contest the charge of having had his DEA registration for Florida revoked by the U.S. Drug Enforcement Administration for improperly prescribing controlled substances.   2/22/2012                   
Miami    Florida 33156" 119545  MD  1945    10/10/2002  Censure and reprimand   The physician did not contest the charge of having been disciplined by the Florida State Board of Medicine for giving a patient excessive doses of radiation.   10/10/2002                  
Brooklyn     New York 11229"    192310          11/6/2003   Annulment of certificate of incorporation pursuant to Section 230-a of the New York State Public Health Law and Section 1503(d) of the New York State Business Corporation Law  The corporation admitted guilt to the charge of willfully failing to comply with Section 1503 of the Business Corporation Law in violation of New York State Education Law Section 6530(12).    10/31/2003                  

As you can see, the first column of the first row should not be Orlando. I'm not sure what is going on.

Please help me resolve this.

2 Answers:

Answer 0: (score: 3)

Here is some code that should help you get started. You can also step through it with the debugger.

  

Declare a protected static DataTable csvData and initially assign it null:

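// Requires: using System; using System.Data; using Microsoft.VisualBasic.FileIO;
// (add a reference to Microsoft.VisualBasic for TextFieldParser)

protected static DataTable csvData = null; // declared up top in your class

// Call this from inside one of your methods:
// csvData = GetDataTabletFromCSVFile(fileName); // converts the CSV file into a DataTable

// defaultTableName and tableDelim are fields defined elsewhere in the class;
// for a comma-separated file, tableDelim would be ",".
private static DataTable GetDataTabletFromCSVFile(string csv_file_path)
{
    csvData = new DataTable(defaultTableName);
    try
    {
        using (TextFieldParser csvReader = new TextFieldParser(csv_file_path))
        {
            csvReader.SetDelimiters(new string[] { tableDelim });
            // Commas inside "..." are treated as data, not as delimiters
            csvReader.HasFieldsEnclosedInQuotes = true;

            // The first record supplies the column names
            string[] colFields = csvReader.ReadFields();
            foreach (string column in colFields)
            {
                DataColumn dataColumn = new DataColumn(column);
                dataColumn.AllowDBNull = true;
                csvData.Columns.Add(dataColumn);
            }

            while (!csvReader.EndOfData)
            {
                string[] fieldData = csvReader.ReadFields();

                // Skip blank rows and rows that carry the "Disclaimer" header text.
                // This check must sit outside the per-field loop so that
                // "continue" skips the whole row.
                if (string.IsNullOrEmpty(fieldData[0]) || fieldData[0].Contains("Disclaimer"))
                {
                    continue;
                }

                // Empty values are kept as empty strings here;
                // assign null instead if you want them stored as NULL.
                for (int i = 0; i < fieldData.Length; i++)
                {
                    if (fieldData[i] == string.Empty)
                    {
                        fieldData[i] = string.Empty; // fieldData[i] = null
                    }
                }
                csvData.Rows.Add(fieldData);
            }
        }
    }
    catch (Exception)
    {
        // Errors are swallowed here; log or rethrow them in real code
    }
    return csvData;
}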
  

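fieldData[0].Contains("Disclaimer") refers to a column in my .csv file, so read through the code, understand the logic, and change it as needed to fit your own .csv file.

If you want to try something simpler, and then parse out the "\" characters that you will see when using the Quick Watch window, try this: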

var lines = File.ReadLines("FilePath of Some .csv File").Select(a => a.Split(',')).ToArray(); 
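
To see what the quote handling changes, here is a minimal sketch that runs a plain Split(',') and a TextFieldParser over the same line, shaped like the sample data in the question. The class and method names (SplitVersusParser, NaiveSplit, QuoteAwareSplit) are made up for illustration, and it assumes the Microsoft.VisualBasic assembly is referenced.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO; // TextFieldParser lives here

public static class SplitVersusParser
{
    // Naive approach: every comma is a delimiter, quotes are ignored.
    public static string[] NaiveSplit(string line)
    {
        return line.Split(',');
    }

    // Quote-aware approach: commas inside "..." stay part of the field,
    // and the surrounding quotes are removed from the value.
    public static string[] QuoteAwareSplit(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }
}
```

For a line such as "Aaron, Joseph","2803 North 700 East Provo, Utah 84604",072800,MD,1927 the naive split produces seven pieces because it also cuts inside the quoted name and address, while the quote-aware version returns the five fields that the file actually contains. A row whose piece count does not match the header is exactly what gets dropped or shifted when loading into a DataTable.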

Answer 1: (score: 1)

Use Sebastien Lorion's Fast CSV Reader on CodeProject rather than rolling your own.

For the reasons why, see The Comma Separated Value (CSV) File Format: Create or parse data in this popular pseudo-standard format. Calling CSV a "standard format" is a travesty of the word "standard".

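Your problem is worse, because it looks like you have quoted fields with embedded end-of-line markers. Now you have to work out whether you have a truncated record or a record that spans multiple lines.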
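
To illustrate what a reader has to keep track of in that situation, here is a rough sketch of a character-by-character record reader. It is not how the Fast CSV Reader works internally and not a replacement for a tested library; the names MultiLineCsvSketch and ReadRecord are hypothetical, and it assumes the common convention that a quote inside a quoted field is written as two quotes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class MultiLineCsvSketch
{
    // Reads one logical record and returns its fields, or null at end of input.
    // A record ends at a line break that is outside quotes, so a quoted field
    // may span several physical lines.
    public static List<string> ReadRecord(TextReader reader)
    {
        if (reader.Peek() < 0) return null;

        var fields = new List<string>();
        var field = new StringBuilder();
        bool inQuotes = false;
        int c;

        while ((c = reader.Read()) >= 0)
        {
            char ch = (char)c;
            if (inQuotes)
            {
                if (ch == '"')
                {
                    if (reader.Peek() == '"')
                    {
                        field.Append('"'); // "" inside quotes is an escaped quote
                        reader.Read();
                    }
                    else
                    {
                        inQuotes = false;  // closing quote
                    }
                }
                else
                {
                    field.Append(ch);      // commas and line breaks are kept as data
                }
            }
            else if (ch == '"')
            {
                inQuotes = true;           // opening quote
            }
            else if (ch == ',')
            {
                fields.Add(field.ToString());
                field.Clear();
            }
            else if (ch == '\r')
            {
                if (reader.Peek() == '\n') reader.Read(); // consume the LF of CRLF
                break;
            }
            else if (ch == '\n')
            {
                break;
            }
            else
            {
                field.Append(ch);
            }
        }

        // Input ran out while still inside quotes: the record is truncated.
        if (inQuotes)
            throw new InvalidDataException("Input ended inside a quoted field (truncated record?)");

        fields.Add(field.ToString());
        return fields;
    }
}
```

Calling ReadRecord in a loop until it returns null yields one field list per logical record. The only way this sketch can tell a truncated record from a multi-line one is that the truncated record never reaches its closing quote before the input ends, which is part of why a well-tested library is the safer choice.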

If you have any control over the source of the data, consider switching to JSON, XML, or some other better-defined format for data interchange.