如何将Unicode字符保存到文本文件

时间:2013-09-19 21:49:40

标签: unicode character-encoding ms-word word-vba-mac

这是在Word for MAC VBA中。我想将Unicode字符从文本框保存到文本文件。例如,这个字符“⅛”。

我使用此代码。

Dim N as Long
N = FreeFile
Dim strText as String
strText = Textbox1.Text 'This is what is in the textbox "⅛"
Open <file path> For Output As N 
     Print #N, strText
Close N

它不保存Unicode字符。我知道我必须改变文本编码格式。我该怎么做?

同样,如何使用Unicode格式读取文本文件?

4 个答案:

答案 0 :(得分:7)

我希望这也适用于Mac上的VBA for Word,但在Windows上我有FileSystemObject的CreateTextFile方法(参见MSDN doc)。在那里,我可以定义创建一个unicode文本文件。

  Set fsObject = CreateObject("Scripting.FileSystemObject")
  Set xmlFile = fsObject.CreateTextFile("path/filename.txt", True, True) 'the second "true" forces a unicode file.

  xmlFile.write "YourUnicodeTextHere"
  xmlFile.close

答案 1 :(得分:1)

VBA无法以这种方式编码UTF-8中的文本。使用ADODB - 是的,用于文本,而不是用于数据库。

'ensure reference is set to Microsoft ActiveX DataObjects library
'(the latest version of it) under "tools/references"
Sub AdoTest()
    Dim adoStream As ADODB.Stream

    Set adoStream = New ADODB.Stream

    'Unicode coding
    adoStream.Charset = "Unicode" 'or any string listed in registry HKEY_CLASSES_ROOT\MIME\Database\Charset

    'open sream
    adoStream.Open

    'write a text
    adoStream.WriteText "Text for testing: ěšč", StreamWriteEnum.stWriteLine

    'save to file
    adoStream.SaveToFile "D:\a\ado.txt"

    adoStream.Close
End Sub

阅读更简单,请参阅我的答案:

Unicode and UTF-8 with VBA

已编辑:我已插入完整示例。

编辑2:添加了对注册表中编码列表的引用

答案 2 :(得分:1)

问题是在Mac上使用VBA,恐怕所有答案都无法在Mac上使用。

问题是关于 Unicode 的问题,其中有很多。我将介绍它的UTF-16方面。 UTF-8遵循不同的路径,但也不难。 AFAIU,您的问题是有关UTF-16字符串的。

下面的代码没有错误处理,我会让您解决。

Function writeUnicodeTextToFile(filePathName As String, myText As String)

`Dim myFileNumber As Long, I As Long, byteArray() As Byte

myFileNumber = FreeFile()
Open filePathName For Binary As #myFileNumber

ReDim byteArray(1)

' Create a BOM for your Unicode flavour
' (CHOOSE! one of the two, programmatically, or hard-code it)
 ' => Little Endian
    byteArray(0) = 255: byteArray(1) = 254
' => Big Endian
    'byteArray(0) = 254: byteArray(1) = 255

' now write the two-byte BOM
Put myFileNumber, 1, byteArray

' redimension your byte array
' note it works even if you don't Redim (go figure) but it's more elegant
I = (LenB(myText) / 2) - 1
ReDim byteArray(I)

' populate the byte array...
byteArray = myText

' ... and write you text AFTER the BOM
Put myFileNumber, 3, byteArray
Close #myFileNumber
End Function

答案 3 :(得分:1)

这是一个 VBA 例程,它将一个字符串作为输入(您的文本),并填充一个字节数组。然后以二进制模式将该数组写入磁盘,确保在前三个字节 (BOM) 之后开始写入。

您将需要这些公共变量: byteArray() As Byte, regexUTF8 As String

子测试()

' 创建 BOM

Dim bom(2) As Byte, someFile As Long

bom(0) = 239: bom(1) = 187: bom(2) = 191

' 将一些东西写成 utf-8 UTF16toUTF8 "L'élève de l'école"

someFile = FreeFile() 打开“MacDisk:test.txt”作为二进制文件#someFile ' 首先,BOM 把 #someFile, 1, bom ' 然后是 utf-8 文本 把 #someFile, 4, byteArray1 关闭#someFile 结束子

Sub UTF16toUTF8(theString As String)

' 作者:伊夫·商博良 ' 将 VB/VBA 字符串(它们都是 16 位)转换为 byteArray1,符合 utf-8 标准

    If isStringUTF8(theString) Then Exit Sub

    Dim iLoop As Long, i As Long, k As Long

    k = 0
    ReDim byteArray1(Len(theString) * 4)
    For iLoop = 1 To Len(theString)
        i = AscW(Mid$(theString, iLoop, 1))
        If i < 0 Then i = i + 65536
        If i > -1 And i < 128 Then
            byteArray1(k) = i
            k = k + 1
        ElseIf i >= 128 And i < 2048 Then
            byteArray1(k) = (i \ 64) Or 192
            byteArray1(k + 1) = (i And 63) Or 128
            k = k + 2
        ElseIf i >= 2048 And i < 65536 Then
            byteArray1(k) = (i \ 4096) Or 224
            byteArray1(k + 1) = ((i \ 64) And 63) Or 128
            byteArray1(k + 2) = (i And 63) Or 128
            k = k + 3
        Else
            byteArray1(k) = (i \ 262144) Or 240
            byteArray1(k + 1) = (((i \ 4096) And 63)) Or 128
            byteArray1(k + 2) = ((i \ 64) And 63) Or 128
            byteArray1(k + 3) = (i And 63) Or 128
            k = k + 4
        End If
    Next

    ReDim Preserve byteArray1(k - 1)

End Sub

函数 isStringUTF8(theString As String) As Boolean

    Dim i As Integer, j As Integer, k As Integer

    ' Prime the regex argument
    If Len(regexUTF8) <> 66 Then
        regexUTF8 = "*[" + Space$(62) + "]*"
        For i = 192 To 253
            Mid(regexUTF8, i - 189, 1) = Chr(i)
        Next
    End If

    ' First quick check: any escaping characters?
    If Not theString Like regexUTF8 Then Exit Function

    'longer check: are escaping characters followed by UTF-8 sequences?
    For i = 1 To Len(theString) - 3
        If Asc(Mid(theString, i, 1)) > 192 Then
            k = Asc(Mid(theString, i, 1))
            If k > 193 And k < 220 Then
                If (Asc(Mid(theString, i + 1, 1)) And 128) Then
                    isStringUTF8 = True
                    Exit Function
                End If
            End If
            If k > 223 Then
                If (Asc(Mid(theString, i + 1, 1)) And 128) And (Asc(Mid(theString, i + 2, 1)) And 128) Then
                    isStringUTF8 = True
                    Exit Function
                End If
            End If
            j = j + 1
            If j > 100 Then Exit For
        End If
    Next
End Function