Question

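I'm currently working on a small project and I've run into a problem that I can't solve at the moment...

I want to read multiple ".CSV" files. They all have the same structure and differ only in their values.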

Header1;Value1;Info1
Header2;Value2;Info2
Header3;Value3;Info3

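When reading the first file I need to create the headers. The problem is that they are laid out in rows rather than in columns (as you can see with Header1 to Header3 above).

Then it needs to read Value1 to Value3 (they are listed in column 2). Most importantly, I need to create another header -> Header4, holding the "Info2" data, which is always in column 3, row 2 (the other values in column 3 can be ignored).

So the result after the first file should look like this: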

Header1;Header2;Header3;Header4;
Value1;Value2;Value3;Info2;

After multiple files, it would look like this:

Header1;Header2;Header3;Header4;
Value1;Value2;Value3;Value4;
Value1b;Value2b;Value3b;Value4b;
Value1c;Value2c;Value3c;Value4c;

I tried using OleDB, but I get a "missing ISAM" error that I can't fix. The code I used is the following:

public DataTable ReadCsv(string fileName)
    {
        DataTable dt = new DataTable("Data");
       /* using (OleDbConnection cn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"" + 
            Path.GetDirectoryName(fileName) + "\";Extendet Properties ='text;HDR=yes;FMT=Delimited(,)';"))
        */
        using (OleDbConnection cn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
            Path.GetDirectoryName(fileName) + ";Extendet Properties ='text;HDR=yes;FMT=Delimited(,)';"))
        {
            using(OleDbCommand cmd = new OleDbCommand(string.Format("select *from [{0}]", new FileInfo(fileName).Name,cn)))
            {
                cn.Open();
                using(OleDbDataAdapter adapter = new OleDbDataAdapter(cmd))
                {
                    adapter.Fill(dt);
                }
            }
        }


        return dt;
    }

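Another attempt I made was using StreamReader. But the headers end up in the wrong place, and I don't know how to change this and do it for every file. The code I tried is the following: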

  public static DataTable ReadCsvFilee(string path)
    {  

        DataTable oDataTable = new DataTable();
        var fileNames = Directory.GetFiles(path);
        foreach (var fileName in fileNames)

        {

            //initialising a StreamReader type variable and will pass the file location
            StreamReader oStreamReader = new StreamReader(fileName);

            // CONTROLS WHETHER WE SKIP A ROW OR NOT
            int RowCount = 0;
            // CONTROLS WHETHER WE CREATE COLUMNS OR NOT
            bool hasColumns = false;
            string[] ColumnNames = null;
            string[] oStreamDataValues = null;
            //using while loop read the stream data till end
            while (!oStreamReader.EndOfStream)
            { 

                String oStreamRowData = oStreamReader.ReadLine().Trim();
                if (oStreamRowData.Length > 0)
                { 

                    oStreamDataValues = oStreamRowData.Split(';');
                    //Because the first row contains column names, we will populate 
                    //the column names by
                    //reading the first row; RowCount == 0 will be true only once
                    // CHANGE TO CHECK FOR COLUMNS CREATED                      
                    if (!hasColumns)
                    {
                        ColumnNames = oStreamRowData.Split(';');

                        //using foreach looping through all the column names
                        foreach (string csvcolumn in ColumnNames)
                        {
                            DataColumn oDataColumn = new DataColumn(csvcolumn.ToUpper(), typeof(string));

                            //setting the default value of empty.string to newly created column
                            oDataColumn.DefaultValue = string.Empty;

                            //adding the newly created column to the table
                            oDataTable.Columns.Add(oDataColumn);
                        }
                        // SET COLUMNS CREATED
                        hasColumns = true;
                        // SET RowCount TO 0 SO WE KNOW TO SKIP COLUMNS LINE
                        RowCount = 0;
                    }
                    else
                    {
                        // IF RowCount IS 0 THEN SKIP COLUMN LINE
                        if (RowCount++ == 0) continue;
                        //creates a new DataRow with the same schema as of the oDataTable            
                        DataRow oDataRow = oDataTable.NewRow();

                        //using foreach looping through all the column names
                        for (int i = 0; i < ColumnNames.Length; i++)
                        {
                            oDataRow[ColumnNames[i]] = oStreamDataValues[i] == null ? string.Empty : oStreamDataValues[i].ToString();
                        }

                        //adding the newly created row with data to the oDataTable       
                        oDataTable.Rows.Add(oDataRow);
                    }

                }
            }
            //close the oStreamReader object
            oStreamReader.Close();
            //release all the resources used by the oStreamReader object
            oStreamReader.Dispose();
        }
            return oDataTable;
        }

I appreciate everyone who is willing to help. Thank you for reading this post!

Yours sincerely

Answer 1

If I understand you correctly, a strict, fixed-position parse will do:

string OpenAndParse(string filename, bool firstFile=false)
{
    var lines = File.ReadAllLines(filename);

    var parsed = lines.Select(l => l.Split(';')).ToArray();

    var header = $"{parsed[0][0]};{parsed[1][0]};{parsed[2][0]};{parsed[1][0]}\n";
    var data   = $"{parsed[0][1]};{parsed[1][1]};{parsed[2][1]};{parsed[1][2]}\n";

    return firstFile
    ? $"{header}{data}"
    : $"{data}";
}

Which will return, if it is the first file:

Header1;Header2;Header3;Header2
Value1;Value2;Value3;Value4

If it is not the first file:

Value1;Value2;Value3;Value4

If I'm right, the rest is just running it over the list of files and concatenating the results into an output file.

Edit: for a directory:

void ProcessFiles(string folderName, string outputFileName)
{
    bool firstFile = true;
    foreach (var f in Directory.GetFiles(folderName))
    {
        File.AppendAllText(outputFileName, OpenAndParse(f, firstFile));
        firstFile = false;
    }
}

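Note: I suppose you want a DataTable rather than an output file. In that case you can simply create a list, put the results into it, and make the list the data source of the data table (so why would you use semicolons there? You probably just need to append the array values to the list).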
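As a rough sketch of that DataTable variant, assuming every file has exactly the three-line "Header;Value;Info" layout from the question: the names `TransposedCsv`, `AddFile` and `LoadFolder` are made up for illustration, and the fourth column name is hard-coded as `Header4` because the files themselves do not contain it.

```csharp
using System;
using System.Data;
using System.IO;
using System.Linq;

public static class TransposedCsv
{
    // Adds one file's content (already split into lines) as a single row.
    // Assumes three "Header;Value;Info" lines, as in the question.
    public static void AddFile(DataTable table, string[] lines)
    {
        var parsed = lines
            .Where(l => !string.IsNullOrWhiteSpace(l))
            .Select(l => l.Trim().Split(';'))
            .ToArray();

        // Create the columns once, from the first file that is added.
        if (table.Columns.Count == 0)
        {
            for (int i = 0; i < 3; i++)
                table.Columns.Add(parsed[i][0], typeof(string));
            table.Columns.Add("Header4", typeof(string)); // assumed column name
        }

        // Column 2 of each line holds the value; line 2, column 3 holds Info2.
        table.Rows.Add(parsed[0][1], parsed[1][1], parsed[2][1], parsed[1][2]);
    }

    // Reads every *.csv file in the folder into one table, one row per file.
    public static DataTable LoadFolder(string folderName)
    {
        var table = new DataTable("Data");
        foreach (var file in Directory.GetFiles(folderName, "*.csv"))
            AddFile(table, File.ReadAllLines(file));
        return table;
    }
}
```

Like the parse above, this still throws if a file has fewer lines or fields than expected, so it only fits input that is known to be well formed.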

Answer 2

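I don't know if this is the best way. But in my case, what I would do is rewrite the data as a conventional CSV while reading all the files, and then create a stream containing the newly created CSV.

It would look something like this: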

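    var csv = new StringBuilder();
    csv.AppendLine("Header1;Header2;Header3;Header4");
    // "file" stands for the collection of values gathered from the input files
    foreach (var item in file)
    {
        // use the same ';' separator as the header line
        var newLine = string.Format("{0};{1};{2};{3}", item.value1, item.value2, item.value3, item.value4);
        csv.AppendLine(newLine);
    }

    //Create Stream over the CSV text that was just built
    MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(csv.ToString()));
    StreamReader reader = new StreamReader(stream);

    //Fill your data table here with your values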

Hope this helps.
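The last step, filling the data table from the rewritten CSV, might be sketched as follows. This is only an illustration under a few assumptions: the first line holds the column names, `;` is the separator, and there are no quoted fields. The name `CsvToTable.Read` is hypothetical.

```csharp
using System;
using System.Data;
using System.IO;

public static class CsvToTable
{
    // Reads semicolon-separated text whose first line holds the column names.
    // Assumes no quoted fields and no separators embedded in values.
    public static DataTable Read(TextReader reader)
    {
        var table = new DataTable("Data");
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim().Length == 0) continue; // skip blank lines

            // Trailing ';' characters (as in the question's sample output) are dropped.
            string[] fields = line.Trim().TrimEnd(';').Split(';');

            if (table.Columns.Count == 0)
            {
                // First non-blank line: create one string column per field.
                foreach (string name in fields)
                    table.Columns.Add(name, typeof(string));
            }
            else
            {
                DataRow row = table.NewRow();
                // Copy at most as many fields as there are columns;
                // cells without a field keep their default (DBNull).
                for (int i = 0; i < fields.Length && i < table.Columns.Count; i++)
                    row[i] = fields[i];
                table.Rows.Add(row);
            }
        }
        return table;
    }
}
```

It accepts any `TextReader`, so it could be called with a `StreamReader` over a stream or with `new StringReader(csv.ToString())` directly, which avoids the intermediate stream.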

Answer 3

(Adding another answer just to keep things tidy)

void ProcessMyFiles(string folderName)
{
    List<MyData> d = new List<MyData>();
    var files = Directory.GetFiles(folderName);
    foreach (var file in files)
    {
        OpenAndParse(file, d);
    }

    string[] headers = GetHeaders(files[0]);
    DataGridView dgv = new DataGridView {Dock=DockStyle.Fill};
    dgv.DataSource = d;
    dgv.ColumnAdded += (sender, e) => {e.Column.HeaderText = headers[e.Column.Index];};

    Form f = new Form();
    f.Controls.Add(dgv);
    f.Show();
}

string[] GetHeaders(string filename)
{
    var lines = File.ReadAllLines(filename);
    var parsed = lines.Select(l => l.Split(';')).ToArray();
    return new string[] { parsed[0][0], parsed[1][0], parsed[2][0], parsed[1][0] };
}

void OpenAndParse(string filename, List<MyData> d)
{
    var lines = File.ReadAllLines(filename);
    var parsed = lines.Select(l => l.Split(';')).ToArray();
    var data = new MyData
    {
        Col1 = parsed[0][1],
        Col2 = parsed[1][1],
        Col3 = parsed[2][1],
        Col4 = parsed[1][2]
    };
    d.Add(data);
}

public class MyData
{
    public string Col1 { get; set; }
    public string Col2 { get; set; }
    public string Col3 { get; set; }
    public string Col4 { get; set; }
}
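The fixed indices in `OpenAndParse` throw an `IndexOutOfRangeException` when a file has fewer lines or fields than expected. If the input cannot be trusted, a more defensive variant might look like the sketch below. `SafeParser.TryParseLines` is a hypothetical helper, not part of the code above, and it returns the four values as a plain array so that the block stays self-contained.

```csharp
using System;
using System.Linq;

public static class SafeParser
{
    // Returns false instead of throwing when the layout is not
    // three "Header;Value;Info" lines with the expected field counts.
    public static bool TryParseLines(string[] lines, out string[] row)
    {
        row = null;
        var parsed = lines
            .Where(l => !string.IsNullOrWhiteSpace(l))
            .Select(l => l.Split(';'))
            .ToArray();

        if (parsed.Length < 3) return false;                      // need three lines
        if (parsed.Take(3).Any(p => p.Length < 2)) return false;  // each needs a value field
        if (parsed[1].Length < 3) return false;                   // line 2 needs the Info field

        row = new[] { parsed[0][1], parsed[1][1], parsed[2][1], parsed[1][2] };
        return true;
    }
}
```

A caller could then skip or log files for which it returns `false` and map the returned array onto `Col1` to `Col4`.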

C#: read CSV into a DataTable and swap rows/columns

3 answers: