How do I read values from a CSV file when some cells contain commas?

Time: 2016-07-19 20:13:54

Tags: c# excel file csv xls

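I have a script that imports a CSV file and reads each line to update the corresponding item in Sitecore. It works for many products, but it fails for products where some cells in the row contain commas (the product description, for example).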

protected void SubmitButton_Click(object sender, EventArgs e)
{
    if (UpdateFile.PostedFile != null)
    {
        var file = UpdateFile.PostedFile;

        // check if valid csv file

        message.InnerText = "Updating...";

        Sitecore.Context.SetActiveSite("backedbybayer");
        _database = Database.GetDatabase("master");
        SitecoreContext context = new SitecoreContext(_database);
        Item homeNode = context.GetHomeItem<Item>();


        var productsItems =
            homeNode.Axes.GetDescendants()
                .Where(
                    child =>
                        child.TemplateID == new ID(TemplateFactory.FindTemplateId<IProductDetailPageItem>()));

        try
        {
            using (StreamReader sr = new StreamReader(file.InputStream))
            {
                var firstLine = true;
                string currentLine;
                var productIdIndex = 0;
                var industryIdIndex = 0;
                var categoryIdIndex = 0;
                var pestIdIndex = 0;
                var titleIndex = 0;
                string title;
                string productId;
                string categoryIds;
                string industryIds;
                while ((currentLine = sr.ReadLine()) != null)
                {
                    var data = currentLine.Split(',').ToList();
                    if (firstLine)
                    {
                        // find index of the important columns
                        productIdIndex = data.IndexOf("ProductId");
                        industryIdIndex = data.IndexOf("PrimaryIndustryId");
                        categoryIdIndex = data.IndexOf("PrimaryCategoryId");
                        titleIndex = data.IndexOf("Title");
                        firstLine = false;
                        continue;
                    }

                    title = data[titleIndex];
                    productId = data[productIdIndex];
                    categoryIds = data[categoryIdIndex];
                    industryIds = data[industryIdIndex];

                    var products = productsItems.Where(x => x.DisplayName == title);
                    foreach (var product in products)
                    {
                        product.Editing.BeginEdit();
                        try
                        {
                            product.Fields["Product Id"].Value = productId;
                            product.Fields["Product Industry Ids"].Value = industryIds;
                            product.Fields["Category Ids"].Value = categoryIds;
                        }
                        finally
                        {
                            product.Editing.EndEdit();
                        }
                    }
                }
            }

            // when done
            message.InnerText = "Complete";
        }
        catch (Exception ex)
        {
            message.InnerText = "Error reading file";
        }             
    }
}

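The problem is that when the description field contains commas, for example "The product is an effective, preventive biofungicide,", the description is split as well. That shifts every index after it, so categoryIds = data[8] gets the wrong value.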
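To make the failure concrete, here is a minimal sketch using a made-up four-column row. The column layout and values are hypothetical and are not taken from the client's file; `SplitDemo` and `NaiveSplit` are illustrative names only.

```csharp
using System;

public static class SplitDemo
{
    // Splits on every comma, exactly as currentLine.Split(',') does above.
    public static string[] NaiveSplit(string line)
    {
        return line.Split(',');
    }

    public static void Main()
    {
        // Hypothetical row: ProductId, Title, Description, PrimaryCategoryId.
        // The description is quoted and contains two commas.
        string line = "42,Sample Product,\"An effective, preventive, biofungicide\",7";

        string[] data = NaiveSplit(line);

        Console.WriteLine(data.Length); // 6 pieces instead of 4 columns
        Console.WriteLine(data[3]);     // " preventive", not the category id
    }
}
```

The category id that should be at index 3 ends up at index 5, which is the same shift that makes data[categoryIdIndex] wrong in the handler above.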

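The spreadsheet is data supplied by our client, so I would rather not ask the client to edit the file unless it is necessary. Is there a way to handle this in my code? Is there a different way to read the file that does not split everything on commas?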

1 answer:

Answer 0 (score: 0)

I suggest using ADO.NET. If a field's data is enclosed in quotes, it is parsed as a single field and any commas inside it are ignored.

Code example:

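using System.Data;
using System.Data.OleDb;
using System.Globalization;
using System.IO;

static DataTable GetDataTableFromCsv(string path, bool isFirstRowHeader)
{
    // HDR tells the text driver whether the first row holds column names.
    string header = isFirstRowHeader ? "Yes" : "No";

    // The text driver treats the folder as the database and the file as a table.
    string pathOnly = Path.GetDirectoryName(path);
    string fileName = Path.GetFileName(path);

    string sql = @"SELECT * FROM [" + fileName + "]";

    // Note: the Jet 4.0 provider is 32-bit only, so the process must run as x86.
    using(OleDbConnection connection = new OleDbConnection(
              @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + pathOnly + 
              ";Extended Properties=\"Text;HDR=" + header + "\""))
    using(OleDbCommand command = new OleDbCommand(sql, connection))
    using(OleDbDataAdapter adapter = new OleDbDataAdapter(command))
    {
        DataTable dataTable = new DataTable();
        dataTable.Locale = CultureInfo.CurrentCulture;
        // Fill opens and closes the connection on its own.
        adapter.Fill(dataTable);
        return dataTable;
    }
}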
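
The method above needs a file path on disk, while the question reads from an uploaded stream. As an alternative that is not part of the original answer, here is a minimal sketch using `TextFieldParser` from the `Microsoft.VisualBasic` assembly, which accepts a `Stream` directly and also keeps commas inside quoted fields. The `CsvRows` class and `Read` method are hypothetical names, and the sketch assumes the client's file quotes any field that contains a comma.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO; // add a reference to Microsoft.VisualBasic

public static class CsvRows
{
    // Reads every record from a CSV stream. Commas inside quoted
    // fields stay part of the field instead of splitting it.
    public static List<string[]> Read(Stream input)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(input))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                // ReadFields returns one record as an array of column values.
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

In the question's handler this could replace the ReadLine/Split loop, for example `var rows = CsvRows.Read(file.InputStream);`, with `rows[0]` as the header row and `Array.IndexOf(rows[0], "ProductId")` used to locate each column. If the file does not quote fields that contain commas, no parser can recover the column boundaries and the file itself would have to be fixed.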