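How can I read a CSV file whose file name contains an umlaut using JET OleDB?
The following example tries to read two different CSV files (böse.csv and good.csv). Both files have identical content; only the file names differ.
First, a schema.ini file is set up with a section for each of the two files: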
using System;
using System.Data;
using System.Data.OleDb;
using System.IO;

private static readonly string[] columns = {"Column1", "Column2", "Column3", "Column4"};
const string workingFolder = "D:\\temp\\";
const string bad = "böse.csv";
const string good = "good.csv";

private static void CreateSchemaIni()
{
    // Note: this StreamWriter overload writes UTF-8 without a BOM.
    using (StreamWriter iniOut = new StreamWriter(Path.Combine(workingFolder, "schema.ini"), false))
    {
        iniOut.WriteLine("[" + bad + "]");
        iniOut.WriteLine("Format=Delimited(;)");
        iniOut.WriteLine("CharacterSet=65001"); // hidden feature: sets the code page to UTF-8
        for (int columnId = 0; columnId < columns.Length; columnId++)
        {
            // Memo instead of Text, so that values are not truncated at 255 characters
            iniOut.WriteLine($"Col{columnId + 1}={columns[columnId]} Memo");
        }
        iniOut.WriteLine("[" + good + "]");
        iniOut.WriteLine("Format=Delimited(;)");
        iniOut.WriteLine("CharacterSet=65001"); // hidden feature: sets the code page to UTF-8
        for (int columnId = 0; columnId < columns.Length; columnId++)
        {
            // Memo instead of Text, so that values are not truncated at 255 characters
            iniOut.WriteLine($"Col{columnId + 1}={columns[columnId]} Memo");
        }
    }
}
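For reference, assuming the code above runs unchanged, the schema.ini it writes into D:\temp\ should contain the following text. The two sections are identical apart from the file name in the section header:

```ini
[böse.csv]
Format=Delimited(;)
CharacterSet=65001
Col1=Column1 Memo
Col2=Column2 Memo
Col3=Column3 Memo
Col4=Column4 Memo
[good.csv]
Format=Delimited(;)
CharacterSet=65001
Col1=Column1 Memo
Col2=Column2 Memo
Col3=Column3 Memo
Col4=Column4 Memo
```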
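Once the schema has been created, the CSV files can be read with JET OleDB. The folder goes into the connection string and the file name into the SELECT statement: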
const string connectionString =
    "Provider=Microsoft.Jet.OLEDB.4.0; Data Source={0}; Extended Properties=\"text;HDR=Yes;FMT=Delimited\"";

private static DataTable ReadCsv(string fileName)
{
    // Data Source is the folder; the CSV file itself is addressed like a table name.
    string connString = string.Format(connectionString, workingFolder);
    string cmdString = string.Format("SELECT * FROM [{0}]", fileName);
    DataSet dataSet = new DataSet();
    using (OleDbDataAdapter dataAdapter = new OleDbDataAdapter(cmdString, connString))
    {
        dataAdapter.Fill(dataSet);
    }
    return dataSet.Tables[0];
}
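I would expect the tables for both files to contain the same data, but they do not. Reading "böse.csv" returns a table with only one column, whereas reading "good.csv" returns a table with the expected four columns.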
private static void Dump(DataTable table)
{
    int rowId = 0;
    foreach (DataRow row in table.Rows)
    {
        Console.WriteLine($"Row {rowId++}:");
        foreach (DataColumn column in table.Columns)
        {
            Console.WriteLine($"\t{column.ColumnName}={row[column]}");
        }
    }
}
Output for böse.csv:
Row 0:
Column1; Column2; Column3; Column4 = 1; 2; 3; 4
Row 1:
Column1; Column2; Column3; Column4 = 5; 6; 7; 8
Output for good.csv:
Row 0:
Column1 = 1
Column2 = 2
Column3 = 3
Column4 = 4
Row 1:
Column1 = 5
Column2 = 6
Column3 = 7
Column4 = 8
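One thing that may be worth ruling out, offered as an assumption and not as a confirmed cause: `new StreamWriter(path, false)` writes UTF-8 without a BOM, so the section header `[böse.csv]` is stored in schema.ini with a two-byte "ö". If the text driver reads schema.ini in a single-byte ANSI code page, that header would not match the file name and the driver would fall back to its defaults, which would be consistent with the single-column result above. The sketch below only shows the byte difference; the class name is hypothetical, and ISO-8859-1 is used as a stand-in for an ANSI code page because it encodes "ö" as the same single byte as Windows-1252.

```csharp
using System;
using System.Text;

static class SectionNameBytes
{
    static void Main()
    {
        // "\u00F6" is "ö"; the escape avoids depending on the source file's encoding.
        const string section = "[b\u00F6se.csv]";

        // What StreamWriter(string, bool) would write: UTF-8, "ö" becomes C3 B6.
        byte[] utf8 = Encoding.UTF8.GetBytes(section);

        // Stand-in for a single-byte ANSI code page (assumption): "ö" becomes F6.
        byte[] latin1 = Encoding.GetEncoding("iso-8859-1").GetBytes(section);

        Console.WriteLine(BitConverter.ToString(utf8));
        // prints 5B-62-C3-B6-73-65-2E-63-73-76-5D
        Console.WriteLine(BitConverter.ToString(latin1));
        // prints 5B-62-F6-73-65-2E-63-73-76-5D
    }
}
```

If this turns out to be relevant, one experiment would be to pass an explicit encoding to the writer, for example `new StreamWriter(path, false, Encoding.Default)` on .NET Framework, and check whether böse.csv is then split into four columns. Whether the driver actually behaves this way is not verified here.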
Is there any way to read a CSV file whose file name contains an umlaut?