使用RegEx和Replace在MS-Access中使用USPS缩写更新地址字段

时间:2012-05-29 09:07:24

标签: regex ms-access vba

我正在尝试在Access中编写一个VBA函数,用标准United States Postal Abbreviations替换地址字段中的单词。我意识到这永远不会是完美的,但我想至少制作简单的缩写(无需购买地址格式化软件),例如

 input      output
 -------    -------------
 North   -> N
 Street  -> ST
 Drive   -> DR
 Lane    -> LN

我考虑使用一个简单的表来存储字符串和替换字符串,然后循环遍历该表/记录集以执行简单的搜索并使用Replace()函数替换例如使用immediate window

 ?Replace("123 North 3rd St", "North", "N", compare:=vbTextCompare)
 123 N 3rd St

但是,此方法可能会导致错误,例如

 ?Replace("123 Northampton St", "North", "N", compare:=vbTextCompare)
 123 Nampton St

我最初的策略是使用正则表达式模式和替换字符串创建替换表,然后遍历该表以进行更精确的搜索和替换。

pattern                 abbrev
-------------------     ------------
{pattern for North}     N
{pattern for Street}    ST

我意识到RegEx在这里可能有些过分,特别是因为我要在数据库中反复遍历地址字段,但是想到使用Replace()函数时更容易想到的方法( 更新:查看来自@ mwolfe02和@Cylian的回复,以及混合解决方案)。

在上面的示例中,我想搜索单词North和Street,它们或者作为字符串中的单词存在(因此由两个空格分隔)或者在字符串的结尾处或字符串的开头处。这涵盖了大多数需要缩写的情况。 例如

address                       formatted
----------------------        --------------------------
123 North 3rd St           -> 123 N 3RD ST
123 ABC Street North       -> 123 ABC ST N
North 3rd Street           -> N 3RD ST
123 North Northampton St   -> 123 N NORTHAMPTON ST

在这些示例中,我想替换字符串中模式的所有实例。我也将一切都转换为大写(我可以在最终结果上使用UCase()没问题)。

有没有人知道这样做的现有模块?任何人都可以帮助模式匹配,如上例所示?为了额外的功劳,我很想知道在表格中创建规则以格式化邮政信箱,例如

address                   formatted
----------------------    --------------------------
P.O. Box 345           -> PO BOX 345
PO Box 345             -> PO BOX 345
Post Office Box 345    -> PO BOX 345
PO. Box 345            -> PO BOX 345
P. O. Box 345          -> PO BOX 345

This stack overflow post给出了以下模式来识别某些PO框“^ \ s * P。?\ s?O。?\ sB [Oo] [Xx]。” (诚​​然不是上面的第三个例子)。同样,我对匹配和替换集合感到不舒服,以找出如何编写这种更精确的替换功能。是否有可以提供帮助的RegEx / Access专家?

4 个答案:

答案 0 :(得分:5)

尝试此功能

Public Function FormatPO(inputString$)
'This example uses **Microsoft VBScript Regular Expressions 5.5**
Dim re As New RegExp, result$
With re
    .Pattern = "\bP(?:[. ]+|ost +)?O(?:ff\.?(?:ice))?[. ]+B(?:ox|\.) +(\d+)\b"
    .Global = True
    .IgnoreCase = True
    If .test(inputString) Then
        FormatPO = .Replace(inputString, "PO BOX $1")
    Else
        MsgBox "Data doesn't matched!"
    End If
End With

可以被称为(来自immediate window

?FormatPO("P.O. Box 563")

给出结果

PO BOX 563

带地址的街道名称的匹配模式需要更多时间来构建。但您可以访问here并在线构建您的RegEx。

希望这有帮助。

答案 1 :(得分:2)

@Cylian对你问题的第二部分有一个很好的答案。我会尝试解决第一个问题。如果您唯一关心的是替换地址中的整个单词,那么以下函数将满足您的需求:

Function AddressReplace(AddressLine As String, _
                        FullName As String, _
                        Abbrev As String)
    AddressReplace = Trim(Replace(" " & AddressLine & " ", _
                                  " " & FullName & " ", _
                                  " " & Abbrev & " "))
End Function

它将地址行包含在打开和关闭空间中,这样您就可以在要替换的每个单词上要求打开和关闭空格。它完成了修剪以摆脱那些临时空间。

以下过程测试代码并生成您要查找的输出:

Sub TestAddressReplace()
    Debug.Print AddressReplace("123 North 3rd St", "North", "N")
    Debug.Print AddressReplace("123 Northampton St", "North", "N")
End Sub

答案 2 :(得分:2)

USPS有一个免费的查找API来验证和标准化地址。您需要注册该服务(快速),然后在API中使用您的ID /密码来反弹他们的网站。是否所有工作都适合您,并提供示例代码。加拿大邮政服务也有同样的事情(不确定它是免费的)。

https://www.usps.com/business/web-tools-apis/welcome.htm

B中。塞维尔

答案 3 :(得分:0)

我从USPS在线缩写列表中创建了一个非常简单的参考表 ref_USPS_abbrev 。这是与最初给出的示例相对应的条目:

WORD          ABBREV
------------  -------------
NORTH         N
STREET        ST

然后,结合对原始帖子的回复,我创建了两个辅助函数。

来自@Cylian:

    ' ----------------------------------------------------------------------'
    '  Formats string containing P.O. Box to USPS Approved PO BOX format    '
    ' ----------------------------------------------------------------------'
    '  Requires Microsoft VBScript Regular Expressions 5.5

    Public Function FormatPO(inputString As String) As String

        Static rePO As Object
        If rePO Is Nothing Then
            Set rePO = CreateObject("vbscript.regexp")
        With rePO
        .Pattern = "\bP(?:[. ]+|ost +)?O(?:ff\.?(?:ice))" & _
                   "?[. ]+B(?:ox|\.) +(\d+)\b"
        .Global = True
        .IgnoreCase = True
        End With
        End If

        With rePO
           If .Test(inputString) Then
              FormatPO = .Replace(inputString, "PO BOX $1")
           Else
              FormatPO = inputString
           End If
        End With
    End Function

并且,使用@ mwolfe02的优秀想法:

    ' ----------------------------------------------------------------------'
    '  Replaces whole word only with an abbreviation in address string      '
    ' ----------------------------------------------------------------------'

    Public Function AddressReplace(AddressLine As String, _
                    FullName As String, _
                    Abbrev As String)

    'Enclose address line in an opening and closing space, so that you 
    'can require an opening and closing space on each word you are trying 
    'to replace. Finish up with a trim to get rid of those temporary spaces.

    AddressReplace = Trim(Replace(" " & AddressLine & " ", _
                              " " & FullName & " ", _
                              " " & Abbrev & " "))
    End Function

然后,结合这些辅助函数,我写了这个函数:

' ----------------------------------------------------------------------'
'  Format address using abbreviations stored in table ref_USPS_abbrev   '
' ----------------------------------------------------------------------'  
'  Requires Microsoft DAO 3.6 Object Library
'  Table ref_USPS_abbrev has two fields: WORD (containing the word to match) 
'  and ABBREV containing the desired abbreviated substitution.
'  United States Postal Services abbreviations are available at:
'  https://www.usps.com/ship/official-abbreviations.htm

Public Function SubstituteUSPS(address As String) As String

Static dba As DAO.Database
Static rst_abbrev As DAO.Recordset

    If IsNull(address) Then Exit Function

    'Initialize the objects 

    If dba Is Nothing Then
        Set dba = CurrentDb
    End If

    'Create the rst_abbrev recordset once from ref_USPS_abbrev. If additional
    'entries are added to the source ref_USPS_abbrev table after the recordset 
    'is created, since it is an dbOpenTable (by default), the recordset will 
    'be updated dynamically. If you use dbOpenSnapshot it will not update 
    'dynamically.

    If rst_abbrev Is Nothing Then
        Set rst_abbrev = dba.OpenRecordset("ref_USPS_abbrev",  _
                                           Type:=dbOpenTable)
    End If

    'Since rst_abbrev is a static object, in the event the function is called 
    'in succession (e.g. while looping through a recordset to update values), 
    'move to the first entry in the recordset each time the function is 
    'called.

    rst_abbrev.MoveFirst

    'Only call the FormatPO helper function if the address has the 
    'string "ox" in it.    

    If InStr(address, "ox") > 0 Then
        address = FormatPO(address)
    End If

    'Loop through the recordset containing the abbreviations
    'and use the AddressReplace helper function to substitute 
    'abbreviations for whole words only.

    Do Until rst_abbrev.EOF
        address = AddressReplace(address, rst_abbrev![WORD],  _
                                 rst_abbrev![ABBREV])
        rst_abbrev.MoveNext
    Loop

    'Convert the address to upper case and trim white spaces and return result
    'You can also add code here to trim out punctuation in the address, too.

    SubstituteUSPS = Trim(UCase(address))

End Function

创建 ref_USPS_abbrev 表进行测试:

Sub CreateUSPSTable()

Dim dbs As Database
Set dbs = CurrentDb

With dbs
    .Execute "CREATE TABLE ref_USPS_abbrev " _
        & "(WORD CHAR, ABBREV CHAR);"
    .Execute " INSERT INTO ref_USPS_abbrev " _
        & "(WORD, ABBREV) VALUES " _
        & "('NORTH', 'N');"
    .Execute " INSERT INTO ref_USPS_abbrev " _
        & "(WORD, ABBREV) VALUES " _
        & "('STREET', 'ST');"
    .Close
End With
End Sub

最后,从immediate window

测试此功能
 CreateUSPSTable
 ?SubstituteUSPS("Post Office Box 345 123 North Northampton Street")
 PO BOX 345 123 N NORTHAMPTON ST

我不是专业的程序员,所以我欢迎进一步清理我的代码的建议,但是现在这很有用。感谢大家。

Stack Overflow又一次FTW!