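Find and replace records between two CSV files in C#

Date: 2015-11-28 11:27:13

Tags: c# .net csv

I need to build a method that enhances the values in one CSV file with values from another. The method needs to:

  • Take the "original" CSV file
  • For each row, look up its column 0 value in column 0 of the "enhancement" CSV file
  • If there is a match, overwrite column 1 of that row in the "original" file with the corresponding column 1 value from the "enhancement" file

I tried the pattern below, which looks workable, but it is so slow that I cannot even check it. File size should not be the problem, since one file is 1 MB and the other is 2 MB, so I must have made some wrong assumptions about how to do this efficiently. What would be a better approach?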

public static string[] LoadReadyCsv()
        {
            string[] scr = System.IO.File.ReadAllLines(@Path...CsvScr);
            string[] aws = System.IO.File.ReadAllLines(@Path...CsvAws);
            Regex CSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

            foreach (var s in scr)
            {
                string[] fieldsScr = CSVParser.Split(s);

                foreach (var a in aws)
                {
                    string[] fieldsAws = CSVParser.Split(a);

                    if (fieldsScr[0] == fieldsAws[0])
                    {
                        fieldsScr[1] = fieldsAws[1];
                    }
                }
            }

            return scr;
        }

Edit: I have added an example below, as requested.

"Original file"

ean, skunum, prodname
111, empty, bread
222, empty, cheese

"Enhancement file"

ean, skunum, prodname
111, 555, foo
333, 444, foo

New "original file"

ean,skunum,prodname
111, 555, bread
222, empty, cheese

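1 Answer:

Answer 0 (score: 1)

You can use OleDb to read the CSV and load it into a DataTable. You can then modify the table and save the result back to the file. Use the code below: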

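using System;
using System.Data;
using System.Data.OleDb;
using System.IO;
using System.Windows.Forms;

public class CSVReader
    {
        // Loads a delimited text file into a DataSet through the Jet text driver.
        // Note: Microsoft.Jet.OLEDB.4.0 is only available to 32-bit processes.
        public DataSet ReadCSVFile(string fullPath, bool headerRow)
        {
            // The text driver treats the folder as the data source and the file as a table.
            string path = fullPath.Substring(0, fullPath.LastIndexOf("\\") + 1);
            string filename = fullPath.Substring(fullPath.LastIndexOf("\\") + 1);
            DataSet ds = new DataSet();

            try
            {
                if (File.Exists(fullPath))
                {
                    // Extended Properties must be wrapped in plain double quotes (no stray backslash).
                    string ConStr = string.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=\"Text;HDR={1};FMT=Delimited\"", path, headerRow ? "Yes" : "No");
                    // Brackets keep file names containing dots or spaces valid as table names.
                    string SQL = string.Format("SELECT * FROM [{0}]", filename);
                    OleDbDataAdapter adapter = new OleDbDataAdapter(SQL, ConStr);
                    adapter.Fill(ds, "TextFile");
                    ds.Tables[0].TableName = "Table1";

                    // Rename columns only when a table was actually loaded.
                    foreach (DataColumn col in ds.Tables["Table1"].Columns)
                    {
                        col.ColumnName = col.ColumnName.Replace(" ", "_");
                    }
                }
            }
            catch (Exception ex)
            {
                MessageBox.Show(ex.Message);
            }
            return ds;
        }
    }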
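The answer mentions saving the result back to the file but does not show that step. The following is a minimal sketch of one way to do it, not part of the original answer: the class name `DataTableCsvWriter` and the method `ToCsv` are hypothetical names introduced here, and the quoting rule (quote a field only when it contains a comma, a double quote, or a line break) is an assumption about the desired output format.

```csharp
using System;
using System.Data;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical helper, introduced only for illustration.
public static class DataTableCsvWriter
{
    // Converts a cell to text and quotes it only when it contains
    // a comma, a double quote, or a line break.
    private static string Escape(object value)
    {
        string s = (value == null || value == DBNull.Value)
            ? ""
            : Convert.ToString(value, CultureInfo.InvariantCulture);
        if (s.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0)
        {
            return "\"" + s.Replace("\"", "\"\"") + "\"";
        }
        return s;
    }

    // Builds the CSV text: one header line, then one line per row.
    public static string ToCsv(DataTable table)
    {
        var sb = new StringBuilder();
        sb.AppendLine(string.Join(",",
            table.Columns.Cast<DataColumn>().Select(c => Escape(c.ColumnName))));
        foreach (DataRow row in table.Rows)
        {
            sb.AppendLine(string.Join(",", row.ItemArray.Select(v => Escape(v))));
        }
        return sb.ToString();
    }
}
```

The returned text could then be written with `System.IO.File.WriteAllText(outputPath, DataTableCsvWriter.ToCsv(table))`, where `outputPath` and `table` stand for whatever path and table you are working with.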

To modify the two data tables, use LINQ:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
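using System.Data; // on .NET Framework, AsEnumerable() and Field<T>() also need a reference to System.Data.DataSetExtensions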

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            DataColumn col = null;

            DataTable original = new DataTable();
            col = original.Columns.Add("ean", typeof(int));
            col.AllowDBNull = true;
            col = original.Columns.Add("skunum", typeof(int));
            col.AllowDBNull = true;
            col = original.Columns.Add("prodname", typeof(string));
            col.AllowDBNull = true;

            original.Rows.Add(new object[] {111, null, "bread"});
            original.Rows.Add(new object[] {222, null, "cheese"});

            DataTable enhancement = new DataTable();
            col = enhancement.Columns.Add("ean", typeof(int));
            col.AllowDBNull = true;
            col = enhancement.Columns.Add("skunum", typeof(int));
            col.AllowDBNull = true;
            col = enhancement.Columns.Add("prodname", typeof(string));
            col.AllowDBNull = true;

            enhancement.Rows.Add(new object[] {111, 555, "foo"});
            enhancement.Rows.Add(new object[] {333, 444, "foo"});

            var joinedObject = (from o in original.AsEnumerable()
                                join e in enhancement.AsEnumerable() on o.Field<int>("ean") equals e.Field<int>("ean")
                                select new { original = o, enhancement = e }).ToList();

            foreach (var row in joinedObject)
            {
                // Only skunum (column 1) is overwritten; prodname keeps its original
                // value, matching the expected output shown in the question.
                row.original["skunum"] = row.enhancement["skunum"];
            }
        }
    }
}
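As an alternative that stays with plain string arrays, the lookup can be done with a dictionary. The nested loop in the question re-splits every enhancement line for every original line, and it never writes the changed fields back into `scr`. The sketch below is not part of the original answer: `CsvEnhancer` and `Enhance` are hypothetical names, it reuses the question's regex, and it assumes that keys should be compared after trimming whitespace and that the last matching enhancement row wins when a key repeats.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, introduced only for illustration.
public static class CsvEnhancer
{
    // Same quote-aware comma splitter as in the question.
    private static readonly Regex CsvParser =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))", RegexOptions.Compiled);

    public static string[] Enhance(string[] original, string[] enhancement)
    {
        // Pass 1: index column 1 of the enhancement lines by their column 0 key.
        var lookup = new Dictionary<string, string>();
        foreach (string line in enhancement)
        {
            string[] fields = CsvParser.Split(line);
            if (fields.Length > 1)
            {
                lookup[fields[0].Trim()] = fields[1];
            }
        }

        // Pass 2: split each original line once and overwrite column 1 on a match.
        var result = new string[original.Length];
        for (int i = 0; i < original.Length; i++)
        {
            string[] fields = CsvParser.Split(original[i]);
            string replacement;
            if (fields.Length > 1 && lookup.TryGetValue(fields[0].Trim(), out replacement))
            {
                fields[1] = replacement;
            }
            // Rebuild the line so the change is actually kept.
            result[i] = string.Join(",", fields);
        }
        return result;
    }
}
```

Each file is split once, so the work grows with the sum of the two line counts rather than their product. With the example files from the question, the row keyed `111` receives `555` in column 1 and the row keyed `222` is left unchanged.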