我想在Excel 2010中使用VBA查询UTF-8编码的CSV文件,并使用以下数据库连接:
provider=Microsoft.Jet.OLEDB.4.0;;data source='xyz';Extended Properties="text;HDR=Yes;FMT=Delimited(,);CharacterSet=65001"
所有CSV文件都以BOM \ xEF \ xBB \ xBF和标题行开头。不知何故,物料清单没有被正确识别,第一列标题被读作"?header_name",即问号前置。我尝试过不同的CharacterSets,我也试过使用Microsoft.ACE.OLEDB.12.0,但到目前为止一切都没有成功。
这是一个已知错误,还是有任何方法可以在不更改源文件编码的情况下获取正确的第一个列标题名称?
答案 0 :(得分:5)
以下过程将整个CSV
文件解压缩为新的Sheet
,从标头中清除BOM
。它将Path,Filename和BOM字符串作为变量来提供灵活性。
使用此过程调用查询过程
Sub Qry_Csv_Utf8()
Const kFile As String = "UTF8 .csv"
Const kPath As String = "D:\StackOverFlow\Temp\"
Const kBOM As String = "\xEF\xBB\xBF"
Call Ado_Qry_Csv(kPath, kFile, kBOM)
End Sub
这是查询程序
Sub Ado_Qry_Csv(sPath As String, sFile As String, sBOM As String)
Dim Wsh As Worksheet
Dim AdoConnect As ADODB.Connection
Dim AdoRcrdSet As ADODB.Recordset
Dim i As Integer
Rem Add New Sheet - Select option required
'With ThisWorkbook 'Use this if procedure is resident in workbook receiving csv data
'With Workbooks(WbkName) 'Use this if procedure is not in workbook receiving csv data
With ActiveWorkbook 'I used this for testing purposes
Set Wsh = .Sheets.Add(After:=.Sheets(.Sheets.Count))
'Wsh.Name = NewSheetName 'rename new Sheet
End With
Set AdoConnect = New ADODB.Connection
AdoConnect.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & sPath & ";" & _
"Extended Properties='text;HDR=Yes;FMT=Delimited(,);CharacterSet=65001'"
Set AdoRcrdSet = New ADODB.Recordset
AdoRcrdSet.Open Source:="SELECT * FROM [" & sFile & "]", _
ActiveConnection:=AdoConnect, _
CursorType:=adOpenDynamic, _
LockType:=adLockReadOnly, _
Options:=adCmdText
Rem Enter Csv Records in Worksheet
For i = 0 To -1 + AdoRcrdSet.Fields.Count
Wsh.Cells(1, 1 + i).Value = _
WorksheetFunction.Substitute(AdoRcrdSet.Fields(i).Name, sBOM, "")
Next
Wsh.Cells(2, 1).CopyFromRecordset AdoRcrdSet
End Sub