Import two CSVs, add specific columns from one CSV, and write the changes to a new CSV (C#)

Time: 2018-02-13 14:36:03

Tags: c# csv

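I have to import 2 CSVs.

CSV 1 [49]: contains about 50 tab-separated columns. CSV 2 [2]: contains 3 columns that should replace the columns at positions [3], [6] and [11] of my first CSV.

Here is what I have done so far:

1) Import the CSV and split it into arrays.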

string employeedatabase = "MYPATH";


List<String> status = new List<String>();

StreamReader file2 = new System.IO.StreamReader(filename);
string line = file2.ReadLine();
while ((line = file2.ReadLine()) != null)
{
    string[] ud = line.Split('\t');
    status.Add(ud[0]);

}

String[] ud_status = status.ToArray();

Question 1: I have about 50 columns to process, and ud_status is only the first one. Do I need 50 lists and 50 string arrays?

2) Import the second CSV and split it into arrays.

List<String> vorname = new List<String>();
List<String> nachname = new List<String>();
List<String> username = new List<String>();

StreamReader file = new System.IO.StreamReader(employeedatabase);
string line3 = file.ReadLine();
while ((line3 = file.ReadLine()) != null)
{
    string[] data = line3.Split(';');
    vorname.Add(data[0]);
    nachname.Add(data[1]);
    username.Add(data[2]);
}

String[] db_vorname = vorname.ToArray();
String[] db_nachname = nachname.ToArray();
String[] db_username = username.ToArray();

Question 2: After loading the two CSVs, I don't know how to combine them and change the columns as described above.

Something like this?

mynewArray = ud_status + "/t" + ud_xy[..n] + "/t" + changed_colum + ud_xy[..n];

Then save "mynewArray" to a tab-separated CSV, encoded as UTF-8.

3 Answers:

Answer 0: (score: 0)

To read the file into a meaningful format, you should set up a class that defines the CSV format:

public class CsvRow
{
    public string vorname { get; set; }
    public string nachname { get; set; }
    public string username { get; set; }
    public CsvRow (string[] data)
    {
         vorname = data[0];
         nachname = data[1];
         username = data[2];
    }
}

Then fill a list:

List<CsvRow> rows = new List<CsvRow>();

using (var file = new System.IO.StreamReader(employeedatabase))
{
    file.ReadLine(); // skip the header line
    string line3;
    while ((line3 = file.ReadLine()) != null)
    {
        rows.Add(new CsvRow(line3.Split(';')));
    }
}

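Define a class for your other CSV in the same way, and include properties, initially unused, for the new fields. Once both files are loaded, you can populate the new properties in a loop, matching records on whatever common field the two CSVs hopefully share. Finally, write the resulting data to a new CSV file.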
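The question does not name a shared field, so the following is only a minimal sketch that pairs records by row position instead. It assumes both files have a header line (the question's code skips the first line of each), the same row order, and exactly three semicolon-separated columns in the second file. The names RowMergeSketch, ReplaceColumns and MergeFiles are hypothetical. It also shows that a single string[] per row is enough, rather than one list per column.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

public static class RowMergeSketch
{
    // Returns a copy of row with the cells at the given positions overwritten.
    public static string[] ReplaceColumns(string[] row, int[] positions, string[] values)
    {
        if (positions.Length != values.Length)
            throw new ArgumentException("positions and values must have the same length");
        var result = (string[])row.Clone();
        for (int i = 0; i < positions.Length; i++)
            result[positions[i]] = values[i];
        return result;
    }

    // Pairs the two files line by line (assumption: same order and row count),
    // replaces columns 3, 6 and 11, and writes a tab-separated UTF-8 file.
    public static void MergeFiles(string mainPath, string lookupPath, string outputPath)
    {
        var positions = new[] { 3, 6, 11 };
        string[] mainLines = File.ReadAllLines(mainPath);
        string[] lookupLines = File.ReadAllLines(lookupPath);

        // Keep the header of the main file (assumes the file is not empty).
        var output = new List<string> { mainLines[0] };

        // Skip(1) drops each header; Zip stops at the end of the shorter file.
        var merged = mainLines.Skip(1).Zip(lookupLines.Skip(1),
            (m, l) => ReplaceColumns(m.Split('\t'), positions, l.Split(';')));
        output.AddRange(merged.Select(cells => string.Join("\t", cells)));

        // UTF8Encoding(false) writes UTF-8 without a byte order mark.
        File.WriteAllLines(outputPath, output, new UTF8Encoding(false));
    }
}
```

If the files do share a key column, a dictionary lookup keyed on that column would be the safer way to pair the rows, since it does not depend on row order.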

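Answer 1: (score: 0)

Your solution is not to use string arrays for this. That will just drive you crazy. It is better to use a System.Data.DataTable object.

I haven't had a chance to test the LINQ lambda expression at the end of this (or really any of it, I wrote this on a break), but it should get you on the right track.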

using (var ds = new System.Data.DataSet("My Data"))
        {
            ds.Tables.Add("File0");
            ds.Tables.Add("File1");
            string[] line;
            using (var reader = new System.IO.StreamReader("FirstFile"))
            {                       
                //first we get columns for table 0                    
                foreach (string s in reader.ReadLine().Split('\t'))
                    ds.Tables["File0"].Columns.Add(s);
                string raw;
                //ReadLine returns null at end of file, so check before splitting
                while ((raw = reader.ReadLine()) != null)
                {
                    line = raw.Split('\t');
                    //and now the rest of the data. 
                    var r = ds.Tables["File0"].NewRow();
                    for (int i = 0; i < line.Length; i++)
                    {
                        r[i] = line[i];
                    }
                    ds.Tables["File0"].Rows.Add(r);
                }                   
            }
            //we could probably do these in a loop or a second method,
            //but you may want subtle differences, so for now we just do it the same way 
            //for file1
            using (var reader2 = new System.IO.StreamReader("SecondFile"))
            {
                foreach (string s in reader2.ReadLine().Split('\t'))
                    ds.Tables["File1"].Columns.Add(s);
                string raw;
                //ReadLine returns null at end of file, so check before splitting
                while ((raw = reader2.ReadLine()) != null)
                {
                    line = raw.Split('\t');
                    //and now the rest of the data. 
                    var r = ds.Tables["File1"].NewRow();
                    for (int i = 0; i < line.Length; i++)
                    {
                        r[i] = line[i];
                    }
                    ds.Tables["File1"].Rows.Add(r);
                }
            }
            //you now have these in functioning datatables. Because we named columns, 
            //you can call them by name specifically, or by index, to replace in the first datatable. 
            string[] columnsToReplace = new string[] { "firstColumnName", "SecondColumnName", "ThirdColumnName" };
            for(int i = 0; i < ds.Tables[0].Rows.Count; i++)
            {
                //you didn't give a sign of any relation between the two tables
                //so this is just by row, and assumes the row count is equivalent.
                //This is also not advised. 
                //if there is a key these sets of data share
                //you should join on them instead. 
                //ItemArray holds cell values, not rows, so assign on the row itself.
                var dr = ds.Tables[0].Rows[i];
                dr[3] = ds.Tables[1].Rows[i][columnsToReplace[0]];
                dr[6] = ds.Tables[1].Rows[i][columnsToReplace[1]];
                dr[11] = ds.Tables[1].Rows[i][columnsToReplace[2]];
            }
            //ds.Tables[0] now has the output you want.  
            var output = new System.Text.StringBuilder();
            // header row: column names joined by tabs (Cast/Select need System.Linq)
            output.AppendLine(string.Join("\t",
                ds.Tables[0].Columns.Cast<System.Data.DataColumn>().Select(c => c.ColumnName)));
            // columns ready, now the rows. 
            foreach (System.Data.DataRow r in ds.Tables[0].Rows)
                output.AppendLine(string.Join("\t", r.ItemArray));
            // StreamWriter creates the file if needed, so no File.Exists check; UTF-8 without BOM.
            using (var file = new System.IO.StreamWriter("MYPATH", false, new System.Text.UTF8Encoding(false))) //or a variable instead of string literal
            {
                file.Write(output.ToString());
            }

        }
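The comments in the code above recommend joining on a shared key rather than relying on row order. Here is a minimal sketch of that idea, assuming both tables contain a key column with unique, non-null values. The column name passed as keyColumn and the names KeyJoinSketch and CopyByKey are hypothetical, since the question does not say which field, if any, the files share.

```csharp
using System;
using System.Data;

public static class KeyJoinSketch
{
    // Copies the named columns from source into target for every target row
    // whose key value exists in source. Returns the number of rows updated.
    public static int CopyByKey(DataTable target, DataTable source, string keyColumn, string[] columns)
    {
        // Setting a primary key lets Rows.Find look rows up by key value.
        // This throws if the key column contains duplicates or nulls.
        source.PrimaryKey = new[] { source.Columns[keyColumn] };

        int updated = 0;
        foreach (DataRow row in target.Rows)
        {
            DataRow match = source.Rows.Find(row[keyColumn]);
            if (match == null) continue; // no partner row: leave target row unchanged

            foreach (string c in columns)
                row[c] = match[c];
            updated++;
        }
        return updated;
    }
}
```

With the tables loaded as above, a call might look like KeyJoinSketch.CopyByKey(ds.Tables["File0"], ds.Tables["File1"], "Id", columnsToReplace), where "Id" stands in for whatever key the two files actually share and the columns to replace have the same names in both tables.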

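Answer 2: (score: 0)

Using Cinchoo ETL, an open source file helper library, you can merge the CSV files as follows. This assumes the 2 CSV files contain the same number of rows.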

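// Columns are tab-separated; the tabs are written as \t so they survive copy and paste.
string CSV1 = "Id\tName\tCity\n" +
    "1\tTom\tNew York\n" +
    "2\tMark\tFairFax";

string CSV2 = "Id\tCity\n" +
    "1\tLas Vegas\n" +
    "2\tDallas";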

dynamic rec1 = null;
dynamic rec2 = null;
StringBuilder csv3 = new StringBuilder();
using (var csvOut = new ChoCSVWriter(new StringWriter(csv3))
    .WithFirstLineHeader()
    .WithDelimiter("\t")
    )
{
    using (var csv1 = new ChoCSVReader(new StringReader(CSV1))
        .WithFirstLineHeader()
        .WithDelimiter("\t")
        )
    {
        using (var csv2 = new ChoCSVReader(new StringReader(CSV2))
            .WithFirstLineHeader()
            .WithDelimiter("\t")
            )
        {
            while ((rec1 = csv1.Read()) != null && (rec2 = csv2.Read()) != null)
            {
                rec1.City = rec2.City;
                csvOut.Write(rec1);
            }
        }
    }
}
Console.WriteLine(csv3.ToString());

Hope it helps.

Disclaimer: I am the author of this library.