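I need to create an ANSI text file from an Access recordset that is output as JSON and YAML. I can write the file, but the output contains the raw characters and I need to escape them. For example, an o-umlaut (ö) should be written as "\u00f6".
I thought that encoding the file as UTF-8 would work, but it doesn't. However, looking at the file encoding again, everything works if the file is written as "UTF-8 without BOM".
Does anyone know how to
a) write the text as UTF-8 without a BOM, or b) write it as ANSI but escape the non-ASCII characters?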
Public Sub testoutput()
    Set db = CurrentDb()
    str_filename = "anothertest.json"
    MyFile = CurrentProject.Path & "\" & str_filename
    str_temp = "Hello world here is an ö"
    fnum = FreeFile
    Open MyFile For Output As fnum
    Print #fnum, str_temp
    Close #fnum
End Sub
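For option (b), the escaping can be done in plain VBA before the string reaches the Print statement. The following is only a minimal sketch: EscapeNonAscii is a hypothetical helper name that does not appear in the original post, and it assumes that every UTF-16 code unit above 127 should become a \uXXXX escape. It does not escape quotes, backslashes or control characters, which a complete JSON writer would also need to handle.

```vba
' Hypothetical helper (not from the original post): replaces every
' character above ASCII 127 with a JSON-style \uXXXX escape.
Public Function EscapeNonAscii(ByVal strText As String) As String
    Dim i As Long
    Dim lngCode As Long
    Dim strChar As String
    Dim strOut As String

    For i = 1 To Len(strText)
        strChar = Mid$(strText, i, 1)
        ' AscW returns a signed Integer; the mask maps it to 0..65535
        lngCode = AscW(strChar) And &HFFFF&
        If lngCode > 127 Then
            ' Pad the hex value to four digits, in lower case
            strOut = strOut & "\u" & LCase$(Right$("0000" & Hex$(lngCode), 4))
        Else
            strOut = strOut & strChar
        End If
    Next i

    EscapeNonAscii = strOut
End Function
```

With this in place, the question's routine could call Print #fnum, EscapeNonAscii(str_temp), so the file contains only ASCII characters. Concatenating inside a loop is slow for very long strings, so this is meant for illustration rather than for large exports.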
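Answer 0 (score: 6)
...OK... I found some sample code that shows how to remove the BOM. I would have thought this could be done more elegantly when actually writing the text. Never mind. The following code removes the BOM.
(This was originally posted by Simon Pedersen at http://www.imagemagick.org/discourse-server/viewtopic.php?f=8&t=12705)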
' Removes the Byte Order Mark - BOM from a text file with UTF-8 encoding
' The BOM defines that the file was stored with an UTF-8 encoding.
Public Function RemoveBOM(filePath)
    ' Create a reader and a writer
    Dim writer, reader, fileSize
    Set writer = CreateObject("Adodb.Stream")
    Set reader = CreateObject("Adodb.Stream")

    ' Load from the text file we just wrote
    reader.Open
    reader.LoadFromFile filePath

    ' Copy all data from reader to writer, except the BOM
    writer.Mode = 3
    writer.Type = 1
    writer.Open
    reader.Position = 5
    reader.CopyTo writer, -1

    ' Overwrite file
    writer.SaveToFile filePath, 2

    ' Return file name
    RemoveBOM = filePath

    ' Kill objects
    Set writer = Nothing
    Set reader = Nothing
End Function
This may be useful to others.
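For anyone who would rather avoid the write-then-repair step, the BOM can also be skipped while the text is being written. The sketch below is an assumption-laden variation, not the code from the linked post: WriteUtf8NoBom is a hypothetical name, and it relies on an ADODB.Stream in text mode with Charset "utf-8" placing the three BOM bytes (EF BB BF) at the start of the stream, so that copying from position 3 into a binary stream leaves them behind.

```vba
' Hypothetical helper (not from the original post): writes strText to
' strPath as UTF-8 and leaves out the three-byte BOM.
Public Sub WriteUtf8NoBom(ByVal strPath As String, ByVal strText As String)
    ' ADO constants declared locally so that no library reference is needed
    Const adTypeBinary As Long = 1
    Const adTypeText As Long = 2
    Const adModeReadWrite As Long = 3
    Const adSaveCreateOverWrite As Long = 2

    Dim stmText As Object
    Dim stmBinary As Object

    ' Text stream: encodes the VBA string as UTF-8 (BOM included)
    Set stmText = CreateObject("ADODB.Stream")
    stmText.Type = adTypeText
    stmText.Mode = adModeReadWrite
    stmText.Charset = "utf-8"
    stmText.Open
    stmText.WriteText strText

    ' Binary stream: receives everything after the BOM
    Set stmBinary = CreateObject("ADODB.Stream")
    stmBinary.Type = adTypeBinary
    stmBinary.Mode = adModeReadWrite
    stmBinary.Open

    stmText.Position = 3            ' skip EF BB BF
    stmText.CopyTo stmBinary
    stmBinary.SaveToFile strPath, adSaveCreateOverWrite

    stmBinary.Close
    stmText.Close
End Sub
```

A call such as WriteUtf8NoBom CurrentProject.Path & "\anothertest.json", str_temp would then replace the Open / Print / Close sequence from the question.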
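Answer 1 (score: 1)
Late to the game here, but I can't be the only coder who is fed up with SQL imports being broken by text files that carry a byte order mark. There are very few Stack questions on this issue, and this is the closest one, so I'm posting an overlapping answer here.
I say 'overlapping' because the code below solves a slightly different problem: its main purpose is to write a Schema file for a folder containing a heterogeneous collection of files. The BOM-handling segment is clearly marked, though.
The key functionality is that we iterate through all the '.csv' files in a folder and test each file with a quick nibble of its first few bytes: we only strip out the byte order mark if we see one.
After that, we're working with low-level file-handling code from primordial C. We have to use byte arrays throughout, because anything else you do in VBA will leave the byte order mark embedded in the structure of a string variable.
So, without further adodb, here is the code: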
Public Sub SetSchema(strFolder As String)
    On Error Resume Next

    ' Write a Schema.ini file to the data folder.
    ' This is necessary if we do not have the registry privileges to set the
    ' correct 'ImportMixedTypes=Text' registry value, which overrides IMEX=1
    ' The code also checks for ANSI or UTF-8 and UTF-16 files, and applies a
    ' usable setting for CharacterSet ( UNICODE|ANSI ) with a horrible hack.
    ' OEM codepage-defined text is not supported: further coding is required
    ' ...And we strip out Byte Order Markers, if we see them - the OLEDB SQL
    ' provider for textfiles can't deal with a BOM in a UTF-16 or UTF-8 file
    ' Not implemented: handling tab-delimited files or other delimiters. The
    ' code assumes a header row with columns, specifies 'scan all rows', and
    ' imposes 'read the column as text' if the data types are mixed.

    Dim strSchema As String
    Dim strFile As String
    Dim strBOM As String
    Dim hndFile As Long
    Dim arrFile() As Byte
    Dim arrBytes(0 To 4) As Byte

    If Right(strFolder, 1) <> "\" Then strFolder = strFolder & "\"

    ' Dir() is an iterator function when you call it with a wildcard:
    strFile = VBA.FileSystem.Dir(strFolder & "*.csv")

    Do While Len(strFile) > 0

        ' Read the first few bytes of the file
        hndFile = FreeFile
        Open strFolder & strFile For Binary As #hndFile
        Get #hndFile, , arrBytes
        Close #hndFile

        strSchema = strSchema & "[" & strFile & "]" & vbCrLf
        strSchema = strSchema & "Format=CSVDelimited" & vbCrLf
        strSchema = strSchema & "ImportMixedTypes=Text" & vbCrLf
        strSchema = strSchema & "MaxScanRows=0" & vbCrLf

        If arrBytes(2) = 0 Or arrBytes(3) = 0 Then ' this is a hack
            strSchema = strSchema & "CharacterSet=UNICODE" & vbCrLf
        Else
            strSchema = strSchema & "CharacterSet=ANSI" & vbCrLf
        End If

        strSchema = strSchema & "ColNameHeader = True" & vbCrLf
        strSchema = strSchema & vbCrLf

        ' BOM disposal - Byte order marks confuse OLEDB text drivers:
        ' Build the BOM as a byte string with ChrB$. Joining the Byte values
        ' with & alone would give their decimal text, e.g. "255254".
        strBOM = ""
        If (arrBytes(0) = &HFE And arrBytes(1) = &HFF) _
        Or (arrBytes(0) = &HFF And arrBytes(1) = &HFE) Then
            strBOM = ChrB$(arrBytes(0)) & ChrB$(arrBytes(1))
        ElseIf arrBytes(0) = &HEF And arrBytes(1) = &HBB And arrBytes(2) = &HBF Then
            strBOM = ChrB$(arrBytes(0)) & ChrB$(arrBytes(1)) & ChrB$(arrBytes(2))
        End If

        If LenB(strBOM) > 0 Then

            hndFile = FreeFile
            Open strFolder & strFile For Binary As #hndFile
            ReDim arrFile(0 To LOF(hndFile) - 1)
            Get #hndFile, , arrFile
            Close #hndFile

            BigReplace arrFile, strBOM, ""

            ' Binary mode does not truncate, so empty the file first;
            ' otherwise the old trailing bytes would survive the rewrite.
            hndFile = FreeFile
            Open strFolder & strFile For Output As #hndFile
            Close #hndFile

            hndFile = FreeFile
            Open strFolder & strFile For Binary As #hndFile
            Put #hndFile, , arrFile
            Close #hndFile
            Erase arrFile

        End If

        strFile = ""
        strFile = Dir

    Loop

    If Len(strSchema) > 0 Then

        strFile = strFolder & "Schema.ini"

        ' Empty any existing Schema.ini before writing the new one
        hndFile = FreeFile
        Open strFile For Output As #hndFile
        Close #hndFile

        hndFile = FreeFile
        Open strFile For Binary As #hndFile
        Put #hndFile, , strSchema
        Close #hndFile

    End If

End Sub

Public Sub BigReplace(ByRef arrBytes() As Byte, ByRef SearchFor As String, ByRef ReplaceWith As String)
    On Error Resume Next

    ' Replaces SearchFor with ReplaceWith where it occurs at the very start
    ' of arrBytes, which is the only place a byte order mark belongs.
    ' The byte array is assigned to a String byte for byte, and the
    ' byte-oriented functions InStrB and MidB$ avoid any text conversion.
    Dim strData As String

    If LenB(SearchFor) = 0 Then Exit Sub

    strData = arrBytes
    If InStrB(1, strData, SearchFor, vbBinaryCompare) = 1 Then
        arrBytes = ReplaceWith & MidB$(strData, LenB(SearchFor) + 1)
    End If

End Sub
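The code is easier to understand if you know that you can assign a byte array to a VBA String, and vice versa. The BigReplace() function is a hack that sidesteps some of VBA's inefficient string handling, especially allocation: you'll find that large files cause serious memory and performance problems if you do it any other way.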
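The byte array and String assignment mentioned above can be tried out in the Immediate window. This is just a small illustration; DemoByteStringRoundTrip is a hypothetical name that is not part of the answer's code. VBA strings are stored as UTF-16, so each character in this example occupies two bytes.

```vba
' Hypothetical demo (not from the original post): round-trips a String
' through a dynamic Byte array by plain assignment.
Public Sub DemoByteStringRoundTrip()
    Dim arrBytes() As Byte
    Dim strText As String

    strText = "AB"
    arrBytes = strText          ' String -> bytes: 65, 0, 66, 0

    ' Number of bytes in the array: two per character
    Debug.Print UBound(arrBytes) - LBound(arrBytes) + 1     ' prints 4

    strText = arrBytes          ' bytes -> String, byte for byte
    Debug.Print strText                                     ' prints AB
End Sub
```

No conversion function is involved in either direction, which is why the byte-level helpers in the code above can work on file contents without altering them.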