Question

I have a CSV file whose header looks like this:

Date,Time,TIRCA-501 [°C],PIRCA-501 [MPa],TIRCA-502 [°C],TIRCA-503 [°C],TIR-504 [°C],WTRIA-501 [ ℃]

(The actual CSV file is much longer; I have cut it down to the relevant part.)

This is the utility method I use to parse the CSV file:

public static bool TryReadFromCsvFile(string csvFilePath, out DataTable fileContent, bool isFirstRowHeader)
{
    fileContent = new DataTable();
    try
    {
        string header = isFirstRowHeader ? "Yes" : "No";

        string pathOnly = Path.GetDirectoryName(csvFilePath);
        string fileName = Path.GetFileName(csvFilePath);

        string sql = @"SELECT * FROM [" + fileName + "] ";

        using (OleDbConnection connection = new OleDbConnection(
            String.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"{0}\";Extended Properties=\"Text;CharacterSet=65001;ImportMixedTypes=Text;IMEX=1;HDR={1};FMT=Delimited;TypeGuessRows=0\"",pathOnly,header)))
        using (OleDbCommand command = new OleDbCommand(sql, connection))
        using (OleDbDataAdapter adapter = new OleDbDataAdapter(command))
        {
            fileContent.Locale = CultureInfo.CurrentCulture;
            adapter.Fill(fileContent);
            return true;
        }
    }
    catch (Exception ex)
    {
        //Logging utility here
        return false;
    }
}

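This method normally works fine, but for the data above, the square brackets '[' and ']' come out as regular parentheses '(' and ')' in the parsed result.

Just to prove I haven't lost my mind, here is the evidence (a debugger screenshot):

[Image: debugger screenshot]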

I also checked the hex code of the offending square bracket in the original file. It is 5B, which is clearly the left square bracket in UTF-8.
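A byte-level check like this one could be reproduced in code roughly as follows. This is only a sketch: the class and method names are hypothetical and not part of the utility method above. It relies on the fact that in UTF-8 the byte 0x5B can only encode '[', because every byte of a multi-byte sequence is 0x80 or higher.

```csharp
using System;
using System.IO;
using System.Text;

public static class BracketCheck
{
    // Returns the UTF-8 bytes of the text as hyphen-separated hex pairs.
    public static string ToUtf8Hex(string text)
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes(text));
    }

    // Counts raw 0x5B bytes in a file (hypothetical helper).
    // In a UTF-8 file, each such byte is a literal '['.
    public static int CountLeftBrackets(string csvFilePath)
    {
        int count = 0;
        foreach (byte b in File.ReadAllBytes(csvFilePath))
        {
            if (b == 0x5B)
            {
                count++;
            }
        }
        return count;
    }
}
```

Calling `BracketCheck.ToUtf8Hex("[")` yields `5B`. A non-zero result from `CountLeftBrackets` on the source file would confirm that the brackets are present on disk and are changed only during the import.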

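Why does the OLEDB import cause this? How can I prevent this behavior?

Edit: I realize there are many other ways to parse a CSV file. Heck, I could even read the content as a list of strings and split on commas. I just want to understand why OleDb causes this problem, so that I can decide whether to scrap the utility method altogether. I would like to see an answer backed by an authoritative source.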

Answer 1

I am personally a fan of Microsoft.VisualBasic.FileIO.TextFieldParser for CSV parsing. Add a reference to Microsoft.VisualBasic. I saved your header as an ANSI-encoded CSV.

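// Needs: using System.Text; using Microsoft.VisualBasic.FileIO;
string dataCsv = "data.csv"; // placeholder: set this to the path of your CSV file
using (var csvReader = new TextFieldParser(
    dataCsv,
    Encoding.GetEncoding("iso-8859-1"), // default encoding (the file was saved as ANSI)
    true))                              // detectEncoding
{
    csvReader.TextFieldType = FieldType.Delimited;
    csvReader.SetDelimiters(",");

    while (!csvReader.EndOfData)
    {
        try
        {
            string[] currentRow = csvReader.ReadFields();
            // turn that into a DataRow
        }
        catch (MalformedLineException)
        {
            // skip (or log) lines that cannot be parsed
        }
    }
    // build the DataTable and add all DataRows
}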
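The two placeholder comments above ("turn that into a DataRow" and "build the DataTable") could be filled in along the following lines. This is a minimal sketch, not the answerer's actual code: the class `CsvTableSketch` and the method `ReadCsv` are hypothetical names, every column is assumed to be a string, and the first row is assumed to be the header. It takes a `TextReader` so it can be tried without a file on disk.

```csharp
using System.Data;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvTableSketch
{
    // Hypothetical helper: reads comma-delimited text into a DataTable,
    // using the first row verbatim as the column names.
    public static DataTable ReadCsv(TextReader reader)
    {
        var table = new DataTable();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");

            if (parser.EndOfData)
            {
                return table; // empty input: no columns, no rows
            }

            // Header row: the names are passed to the DataTable as-is,
            // so '[' and ']' stay in the column names.
            // Assumption: header names are unique (duplicates would throw).
            foreach (string name in parser.ReadFields())
            {
                table.Columns.Add(name, typeof(string));
            }

            // Data rows: one DataRow per parsed line.
            // Assumption: no row has more fields than the header.
            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                table.Rows.Add((object[])fields);
            }
        }
        return table;
    }
}
```

To read from a file, pass a `StreamReader` opened with the appropriate encoding, for example `CsvTableSketch.ReadCsv(new StreamReader(path, Encoding.UTF8))`. Because the header strings go straight into `DataColumn` names without passing through the Jet provider, a header such as `TIRCA-501 [°C]` keeps its square brackets.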
