How can I determine whether text is in Cyrillic characters?

Time: 2008-10-15 22:11:03

Tags: unicode outlook outlook-vba

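My junk mail folder has been filling up with messages that appear to be written in the Cyrillic alphabet. If a message body or a message subject is in Cyrillic, I want to permanently delete it.

On my screen I see Cyrillic characters, but when I iterate through the messages in VBA within Outlook, the "Subject" property of the message returns question marks.

How can I determine whether the subject of a message is in Cyrillic characters?

(Note: I have examined the "InternetCodepage" property; it is usually Western European.)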

3 Answers:

Answer 0: (score: 3)

The String data type in VB/VBA can handle Unicode characters, but the IDE itself cannot display them (hence the question marks).
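
One way to confirm that a string really holds Cyrillic text, rather than literal question marks, is to print each character's code instead of the character itself. The following is only a minimal sketch; `DumpCodePoints` is a hypothetical helper name, not something built into VBA or Outlook.

```vba
' Hypothetical helper: returns the UTF-16 code units of sText as
' space-separated hex values such as "U+0416".
Public Function DumpCodePoints(ByVal sText As String) As String

    Dim i As Long
    Dim nCode As Long
    Dim sOut As String

    For i = 1 To Len(sText)
        ' AscW returns a signed Integer; mask it into the range 0..65535
        nCode = AscW(Mid$(sText, i, 1)) And &HFFFF&
        ' Left-pad the hex value to four digits
        sOut = sOut & "U+" & Right$("0000" & Hex$(nCode), 4) & " "
    Next

    DumpCodePoints = Trim$(sOut)

End Function
```

Calling `Debug.Print DumpCodePoints(oMailItem.Subject)` in the Immediate window should then show values in the U+04xx range for Cyrillic letters, whereas a genuine question mark would show up as U+003F.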

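I wrote an IsCyrillic function that might help you out. The function takes a single String argument and returns True if the string contains at least one Cyrillic character. I tested this code with Outlook 2007 and it seems to work fine. To test it, I sent myself some emails with Cyrillic text in the subject line and verified that my test code could correctly pick those emails out from among everything else in my Inbox.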

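So, I actually have two code snippets:

  • The code that contains the IsCyrillic function. This can be copy-pasted into a new VBA module or added to the code you already have.
  • The Test routine I wrote (in Outlook VBA) to check that the code actually works. It demonstrates how to use the IsCyrillic function.

The Code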

Option Explicit

Public Const errInvalidArgument = 5

' Returns True if sText contains at least one Cyrillic character
' NOTE: Assumes UTF-16 encoding

Public Function IsCyrillic(ByVal sText As String) As Boolean

    Dim i As Long

    ' Loop through each char. If we hit a Cyrillic char, return True.

    For i = 1 To Len(sText)

        If IsCharCyrillic(Mid(sText, i, 1)) Then
            IsCyrillic = True
            Exit Function
        End If

    Next

End Function

' Returns True if the given character is part of the Cyrillic alphabet
' NOTE: Assumes UTF-16 encoding

Private Function IsCharCyrillic(ByVal sChar As String) As Boolean

    ' The Unicode Cyrillic block is U+0400-U+04FF and the
    ' Cyrillic Supplement block is U+0500-U+052F

    Const CYRILLIC_START As Integer = &H400
    Const CYRILLIC_END  As Integer = &H52F

    ' A single UTF-16 code unit is two bytes long

    If LenB(sChar) <> 2 Then
        Err.Raise errInvalidArgument, _
            "IsCharCyrillic", _
            "sChar must be a single Unicode character"
    End If

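    ' Get Unicode value of character. AscW returns a signed Integer,
    ' so code units at or above U+8000 come back negative; that is
    ' harmless here because they fall outside the Cyrillic range.

    Dim nCharCode As Integer
    nCharCode = AscW(sChar)

    ' Is char code in the range of the Cyrillic characters?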

    If (nCharCode >= CYRILLIC_START And nCharCode <= CYRILLIC_END) Then
        IsCharCyrillic = True
    End If

End Function


Example Usage

' On my box, this code iterates through my Inbox. On your machine,
' you may have to switch to your Inbox in Outlook before running this code.
' I placed this code in `ThisOutlookSession` in the VBA editor. I called
' it in the Immediate window by typing `ThisOutlookSession.TestIsCyrillic`

Public Sub TestIsCyrillic()

    Dim oItem As Object
    Dim oMailItem As MailItem

    For Each oItem In ThisOutlookSession.ActiveExplorer.CurrentFolder.Items

        If TypeOf oItem Is MailItem Then

            Set oMailItem = oItem

            If IsCyrillic(oMailItem.Subject) Then

                ' I just printed out the offending subject line
                ' (it will display as ? marks, but I just
                ' wanted to see it output something).
                ' In your case, you could change this line to:
                '
                '     oMailItem.Delete
                '
                ' to actually delete the message. Note that deleting
                ' inside a For Each loop can skip items, so a backwards
                ' indexed loop is safer for bulk deletion.

                Debug.Print oMailItem.Subject

            End If

        End If

    Next

End Sub
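
If the Debug.Print line is swapped for a deletion, removing items from the collection while a For Each loop is walking it can cause some items to be skipped. A common workaround is to walk the collection by index from the end. The sketch below assumes it is pasted into `ThisOutlookSession` and that the folder to clean is the one currently shown in Outlook. `DeleteCyrillicMail` and `HasCyrillic` are hypothetical names; `HasCyrillic` is a compact stand-in for the IsCyrillic function above so that the block runs on its own. It checks the body as well as the subject, as the question asked.

```vba
' Hypothetical routine: deletes every mail item in the folder currently
' shown in Outlook whose subject or body contains a Cyrillic character.
Public Sub DeleteCyrillicMail()

    Dim oItems As Outlook.Items
    Dim oItem As Object
    Dim i As Long

    Set oItems = Application.ActiveExplorer.CurrentFolder.Items

    ' Walk backwards so that deleting item i does not shift the
    ' positions of the items that are still to be visited.
    For i = oItems.Count To 1 Step -1
        Set oItem = oItems.Item(i)
        If TypeOf oItem Is Outlook.MailItem Then
            If HasCyrillic(oItem.Subject) Or HasCyrillic(oItem.Body) Then
                oItem.Delete
            End If
        End If
    Next i

End Sub

' Compact stand-in for IsCyrillic: True if any UTF-16 code unit
' of sText falls in the range U+0400..U+052F.
Private Function HasCyrillic(ByVal sText As String) As Boolean

    Dim i As Long
    Dim nCode As Long

    For i = 1 To Len(sText)
        nCode = AscW(Mid$(sText, i, 1)) And &HFFFF&
        If nCode >= &H400 And nCode <= &H52F Then
            HasCyrillic = True
            Exit Function
        End If
    Next

End Function
```

As far as I know, Delete normally moves a message to the Deleted Items folder rather than erasing it outright, so a truly permanent removal may need a second pass over that folder.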

Answer 1: (score: 0)

"The message's Subject property returns a bunch of question marks."

A classic string encoding problem. It sounds like that property is returning ASCII, but you want UTF-8 or Unicode.
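
To illustrate how this kind of encoding loss produces question marks, the sketch below round-trips a string through the system ANSI code page using StrConv. `RoundTripAnsi` and `DemoAnsiLoss` are hypothetical names, and the result depends on the machine: on a system whose ANSI code page has no Cyrillic letters the characters would be expected to come back as "?", while on a Cyrillic-locale system they may survive intact.

```vba
' Hypothetical helper: converts sText to the system ANSI code page
' and back again, losing any character that code page cannot represent.
Public Function RoundTripAnsi(ByVal sText As String) As String
    RoundTripAnsi = StrConv(StrConv(sText, vbFromUnicode), vbUnicode)
End Function

Public Sub DemoAnsiLoss()

    Dim sCyr As String
    Dim sBack As String

    ' Build two Cyrillic letters from their code points so that the
    ' source file itself needs no non-ANSI characters.
    sCyr = ChrW(&H416) & ChrW(&H438)
    sBack = RoundTripAnsi(sCyr)

    Debug.Print AscW(sCyr)      ' code of the first original character
    Debug.Print AscW(sBack)     ' 63 here means it became a literal "?"
    Debug.Print (sBack = sCyr)  ' True only if nothing was lost

End Sub
```

Running the same AscW check on the Subject property itself shows which case applies: a value of 63 means the property really returned a question mark, while a value between 1024 and 1327 means the Cyrillic text is intact and only the display is lossy.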

Answer 2: (score: 0)

It seems to me you already have an easy solution: just look for any subject line with (say) 5 question marks in it.