Regex to identify standalone numbers

Time: 2017-04-20 20:27:59

Tags: regex excel-vba vba excel

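I have a string that may or may not contain numbers. If there is a number, it is either standalone, as in '3200 Fedex FL' or '10 Downing St', or part of a name, as in '4th ST NW', 'I96' or 'US28'. I am looking for a regular expression that ignores the standalone numbers and gives me the rest of the string, while keeping the numbers that are part of a name.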

My attempt:

Function getAddress(addr As String)  
Dim allMatches As Object  
Dim RE As Object  
Set RE = CreateObject("vbscript.regexp")  
RE.Pattern = "(\b[\d]+\b)"   
RE.Global = True  
RE.IgnoreCase = True 

Set allMatches = RE.Execute(addr)

If (allMatches.Count <> 0) Then
    result = allMatches.Item(0).submatches.Item(0)
End If
getAddress = result
End Function

Sample data set:

0 I64 EB MM 93
1519 KINGSCROSS RD
28 VA288 298
JOHN RANDOLPH RD
4700 WALMSLEY BL
BOWLING GREEN RD / BOB WHITE RD
BRUCE CT / FLORIDA AV
BUCK RD AND WHITEHALL RD
DOWNTOWN EWRESSWY EB 2ND ST
HYHLAND VISTA / GEORGE WASHINGTON BL
HYHLAND VISTA / GEORGE WASHINGTON BL
HYHLAND VISTA DR / GEORGE WASHINGTON BL
I95 25 43
LAUARL RIDGE MILL RD / CLARENCE RD
LAUARL RIDGE MILL RD / CLARENCE RD
NOVAH HOWARD ST / SEMINARY RD
OLD COUVAHOUSE RD R COUVAHOUSE RD
OLD COUVAHOUSE RD R COUVAHOUSE RD
WOODLAND AND ROANOKE
1501 SAMS CR1
15281 WHITEHEAD RD
1532 MARLBORO ST
16907 BRANDERS BRIDGE RD
1750 WILLIAM ST

Expected output:
I64 EB MM
KINGSCROSS RD
VA288
JOHN RANDOLPH RD
BOWLING GREEN RD / BOB WHITE RD
BRUCE CT / FLORIDA AV
BUCK RD AND WHITEHALL RD
DOWNTOWN EWRESSWY EB 2ND ST
HYHLAND VISTA / GEORGE WASHINGTON BL
HYHLAND VISTA / GEORGE WASHINGTON BL
HYHLAND VISTA DR / GEORGE WASHINGTON BL
I95
LAUARL RIDGE MILL RD / CLARENCE RD
LAUARL RIDGE MILL RD / CLARENCE RD
NOVAH HOWARD ST / SEMINARY RD
OLD COUVAHOUSE RD R COUVAHOUSE RD
OLD COUVAHOUSE RD R COUVAHOUSE RD
WOODLAND AND ROANOKE
SAMS CR1
WHITEHEAD RD
MARLBORO ST
BRANDERS BRIDGE RD
WILLIAM ST

5 answers:

Answer 0 (score: 0)

Try the following regex:

(?![0-9]+\s).

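It uses a negative lookahead to skip any character that begins a run of consecutive digits followed by whitespace, and matches every other character.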
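As a rough check of how this pattern behaves, here is a small Python sketch. It assumes the kept text is rebuilt by joining every character the pattern matches; the helper name `keep_chars` is hypothetical and not part of the answer.

```python
import re

# Pattern from the answer: match any single character that does NOT
# start a run of digits followed by whitespace.
PATTERN = re.compile(r"(?![0-9]+\s).")


def keep_chars(text: str) -> str:
    """Join every character the pattern matches (hypothetical helper)."""
    return "".join(PATTERN.findall(text))


for sample in ["3200 Fedex FL", "4th ST NW", "US28 EB", "MM 93"]:
    print(repr(sample), "->", repr(keep_chars(sample)))
```

Under these assumptions, digits that end a name and are followed by a space (as in 'US28 EB') are dropped too, while a number at the very end of the string is kept, so the result is worth checking against the sample data before relying on it.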

Answer 1 (score: 0)

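It seems easier to match the standalone numbers and replace them with an empty string. The regex for a standalone number (including the surrounding whitespace) is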

\s*\b\d+\b\s*

Demo: https://regex101.com/r/JuEAQd/1

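Note: the output in the demo above is a bit messy because all the test strings are placed in a single input; the pattern is meant to be applied to one string at a time.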
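A minimal Python sketch of this replace approach, assuming each address is processed as its own string. Replacing with a single space and then trimming, instead of replacing with an empty string, is an assumption made here so that a number sitting between two words does not glue them together; `strip_standalone_numbers` is a hypothetical name.

```python
import re

# Standalone number together with any whitespace around it.
STANDALONE = re.compile(r"\s*\b\d+\b\s*")


def strip_standalone_numbers(address: str) -> str:
    """Drop standalone numbers; keep digits that are part of a name."""
    # Substitute one space so neighbouring words stay separated,
    # then trim the spaces left at either end.
    return STANDALONE.sub(" ", address).strip()


for row in ["0 I64 EB MM 93", "28 VA288 298", "I95 25 43", "1501 SAMS CR1"]:
    print(row, "->", strip_standalone_numbers(row))
```

Digits inside a token such as 'I64' or '2ND' are left alone because `\b` does not match between two word characters.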

Answer 2 (score: 0)

I believe this is the pattern you want (it matches letters followed by digits plus the rest of the word, or digits followed by letters plus the rest of the word):

/\b([A-Za-z]+[0-9]+[A-Za-z0-9]*|[0-9]+[A-Za-z]+[A-Za-z0-9]*)\b/gm

And in action...

// [A-Za-z] is used rather than [A-z], which would also match [ \ ] ^ _ and `
var pattern = /\b([A-Za-z]+[0-9]+[A-Za-z0-9]*|[0-9]+[A-Za-z]+[A-Za-z0-9]*)\b/gm
// Each line ends with \n so that the continuation does not fuse adjacent rows
var str = "0 I64 EB MM 93\n\
1519 KINGSCROSS RD\n\
28 VA288 298\n\
JOHN RANDOLPH RD\n\
4700 WALMSLEY BL\n\
BOWLING GREEN RD / BOB WHITE RD\n\
BRUCE CT /FLORIDA AV\n\
BUCK RD AND WHITEHALL RD\n\
DOWNTOWN EWRESSWY EB 2ND ST\n\
HYHLAND VISTA / GEORGE WASHINGTON BL\n\
HYHLAND VISTA / GEORGE WASHINGTON BL\n\
HYHLAND VISTA DR / GEORGE WASHINGTON BL\n\
I95 25 43\n\
LAUARL RIDGE MILL RD /CLARENCE RD\n\
LAUARL RIDGE MILL RD /CLARENCE RD\n\
NOVAH HOWARD ST /SEMINARY RD\n\
OLD COUVAHOUSE RD R COUVAHOUSE RD\n\
OLD COUVAHOUSE RD R COUVAHOUSE RD\n\
WOODLAND AND ROANOKE\n\
1501 SAMS CR1\n\
15281 WHITEHEAD RD\n\
1532 MARLBORO ST\n\
16907 BRANDERS BRIDGE RD\n\
1750 WILLIAM ST"
// Logs only the tokens that mix letters and digits: ["I64", "VA288", "2ND", "I95", "CR1"]
console.log(str.match(pattern))

Answer 3 (score: 0)

You did not specify which programming language you are using, but you need to replace rather than just match. For example:

For Python:

import re
new_text = re.sub(r" ?\b\d+\b ?", "", old_text)

For PHP:

$new_text = preg_replace('/ ?\b\d+\b ?/', '', $old_text);

For VB.NET:

Dim ResultString As String
Try
    Dim RegexObj As New Regex(" ?\b\d+\b ?")
    ResultString = RegexObj.Replace(SubjectString, "")
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try


Answer 4 (score: 0)

Use the Replace method of the RegExp object:

RE.Global = True   
RE.Pattern = "\b\d+(\s|$)"
result = RE.Replace(addr, "") ' Remove all matches from string
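For comparison, a rough Python rendering of the same pattern (an assumption for illustration only; the answer itself targets VBScript.RegExp). The pattern consumes the whitespace after each number but not before it, so a number at the end of the string leaves a trailing space behind. The final `strip()` and the name `remove_numbers` are additions made here, not part of the answer.

```python
import re

# Same pattern as in the VBA snippet: a standalone number followed by
# whitespace or by the end of the string.
NUMBER_THEN_GAP = re.compile(r"\b\d+(\s|$)")


def remove_numbers(addr: str) -> str:
    """Remove standalone numbers, then trim leftover edge whitespace."""
    return NUMBER_THEN_GAP.sub("", addr).strip()


raw = NUMBER_THEN_GAP.sub("", "I95 25 43")
print(repr(raw))                         # untrimmed result
print(repr(remove_numbers("I95 25 43"))) # trimmed result
```

In VBA the equivalent of the trimming step would be wrapping the result in `Trim`.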