如何使用OLEDB解析和导入每个单元格都用双引号括起来的CSV文件,因为有些行包含逗号?我无法更改格式,因为它来自供应商。
我正在尝试以下操作,并且失败并出现IO错误:
public DataTable ConvertToDataTable(string fileToImport, string fileDestination)
{
string fullImportPath = fileDestination + @"\" + fileToImport;
OleDbDataAdapter dAdapter = null;
DataTable dTable = null;
try
{
if (!File.Exists(fullImportPath))
return null;
string full = Path.GetFullPath(fullImportPath);
string file = Path.GetFileName(full);
string dir = Path.GetDirectoryName(full);
//create the "database" connection string
string connString = "Provider=Microsoft.Jet.OLEDB.4.0;"
+ "Data Source=\"" + dir + "\\\";"
+ "Extended Properties=\"text;HDR=No;FMT=Delimited\"";
//create the database query
string query = "SELECT * FROM " + file;
//create a DataTable to hold the query results
dTable = new DataTable();
//create an OleDbDataAdapter to execute the query
dAdapter = new OleDbDataAdapter(query, connString);
//fill the DataTable
dAdapter.Fill(dTable);
}
catch (Exception ex)
{
throw new Exception(CLASS_NAME + ".ConvertToDataTable: Caught Exception: " + ex);
}
finally
{
if (dAdapter != null)
dAdapter.Dispose();
}
return dTable;
}
当我使用普通的CSV时,它可以正常工作。我是否需要更改connString中的内容?
答案 0 :(得分:3)
使用专用的CSV解析器。
那里有很多人。一个受欢迎的版本是FileHelpers,但Microsoft.VisualBasic.FileIO
命名空间中隐藏了一个 - TextFieldParser
。
答案 1 :(得分:1)
答案 2 :(得分:1)
您可以使用此代码:MS office required
private void ConvertCSVtoExcel(string filePath = @"E:\nucc_taxonomy_140.csv", string tableName = "TempTaxonomyCodes")
{
string tempPath = System.IO.Path.GetDirectoryName(filePath);
string strConn = @"Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + tempPath + @"\;Extensions=asc,csv,tab,txt";
OdbcConnection conn = new OdbcConnection(strConn);
OdbcDataAdapter da = new OdbcDataAdapter("Select * from " + System.IO.Path.GetFileName(filePath), conn);
DataTable dt = new DataTable();
da.Fill(dt);
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(ConfigurationSettings.AppSettings["dbConnectionString"]))
{
bulkCopy.DestinationTableName = tableName;
bulkCopy.BatchSize = 50;
bulkCopy.WriteToServer(dt);
}
}
答案 3 :(得分:1)
处理CSV文件时需要考虑很多事项。但是,您从文件中提取它们,您应该知道如何处理解析。有些课程可以让你分道扬but,但是大多数课程都没有处理Excel使用嵌入式逗号,引号和换行符所做的细微差别。但是,如果您只想解析像CSV这样的txt文件,那么加载Excel或MS类似乎会带来很多开销。
您可以考虑的一件事是在您自己的Regex中进行解析,这也将使您的代码更加独立于平台,以防您需要在某些时候将其移植到另一台服务器或应用程序。使用正则表达式的好处是几乎每种语言都可以访问。也就是说,有一些很好的正则表达式模式可以处理CSV拼图。这是我的镜头,它包括嵌入的逗号,引号和换行符。正则表达式代码/模式和解释:
希望有所帮助。
答案 4 :(得分:0)
答案 5 :(得分:0)
private static void Mubashir_CSVParser(string s)
{
// extract the fields
Regex RegexCSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
String[] Fields = RegexCSVParser.Split(s);
// clean up the fields (remove " and leading spaces)
for (int i = 0; i < Fields.Length; i++)
{
Fields[i] = Fields[i].TrimStart(' ', '"');
Fields[i] = Fields[i].TrimEnd('"');// this line remove the quotes
//Fields[i] = Fields[i].Trim();
}
}
答案 6 :(得分:-1)
只是因为任何人有类似的问题,我想发布我使用的代码。我最终使用Textparser来获取文件并解析列,但我使用了recneath来完成剩下的工作和子串。
/// <summary>
/// Parses each string passed as a "row".
/// This routine accounts for both double quotes
/// as well as commas currently, but can be added to
/// </summary>
/// <param name="row"> string or row to be parsed</param>
/// <returns></returns>
private List<String> ParseRowToList(String row)
{
List<String> returnValue = new List<String>();
if (row[0] == '\"')
{// Quoted String
if (row.IndexOf("\",") > -1)
{// There are more columns
returnValue = ParseRowToList(row.Substring(row.IndexOf("\",") + 2));
returnValue.Insert(0, row.Substring(1, row.IndexOf("\",") - 1));
}
else
{// This is the last column
returnValue.Add(row.Substring(1, row.Length - 2));
}
}
else
{// Unquoted String
if (row.IndexOf(",") > -1)
{// There are more columns
returnValue = ParseRowToList(row.Substring(row.IndexOf(",") + 1));
returnValue.Insert(0, row.Substring(0, row.IndexOf(",")));
}
else
{// This is the last column
returnValue.Add(row.Substring(0, row.Length));
}
}
return returnValue;
}
然后Textparser的代码是:
// string pathFile = @"C:\TestFTP\TestCatalog.txt";
string pathFile = @"C:\TestFTP\SomeFile.csv";
List<String> stringList = new List<String>();
TextFieldParser fieldParser = null;
DataTable dtable = new DataTable();
/* Set up TextFieldParser
* use the correct delimiter provided
* and path */
fieldParser = new TextFieldParser(pathFile);
/* Set that there are quotes in the file for fields and or column names */
fieldParser.HasFieldsEnclosedInQuotes = true;
/* delimiter by default to be used first */
fieldParser.SetDelimiters(new string[] { "," });
// Build Full table to be imported
dtable = BuildDataTable(fieldParser, dtable);
答案 7 :(得分:-1)
这是我在项目中使用的,解析单行数据。
private string[] csvParser(string csv, char separator = ',')
{
List <string> parsed = new List<string>();
string[] temp = csv.Split(separator);
int counter = 0;
string data = string.Empty;
while (counter < temp.Length)
{
data = temp[counter].Trim();
if (data.Trim().StartsWith("\""))
{
bool isLast = false;
while (!isLast && counter < temp.Length)
{
data += separator.ToString() + temp[counter + 1];
counter++;
isLast = (temp[counter].Trim().EndsWith("\""));
}
}
parsed.Add(data);
counter++;
}
return parsed.ToArray();
}
http://zamirsblog.blogspot.com/2013/09/c-csv-parser-csvparser.html