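I want to create a clean URL from a piece of text, for example:
Alpha test' buy Berta Global Associates (C)
The URL should look like this:
alpha-test-buy-berta-global-associates-c
I'm currently using this formula in Excel: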
=LOWER(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A38;"--";"-");" / ";"-");" ";"-");": ";"-");" - ";"-");"_";"-");"?";"");",";"");".";"");"'";"");")";"");"(";"");":";"");" ";"-");"&";"and");"!";"");"/";"-");"""";""))
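However, it doesn't seem to catch every special character, so my URLs are not as clean as I'd like.
Do you know of an Excel formula or VBA code that makes sure all special characters are converted correctly into a clean URL?
Thanks.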
Answer 0 (score: 2)
I'd suggest putting the following function in a VBA module and calling it like a regular worksheet formula:
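Function NormalizeToUrl(cell As Range) As String
    Dim strPattern As String
    Dim regEx As Object

    ' Late-bound VBScript regex object, so no library reference is required
    Set regEx = CreateObject("vbscript.regexp")
    ' Match any run of characters that are neither word characters nor hyphens
    strPattern = "[^\w-]+"

    With regEx
        .Global = True          ' replace every match, not only the first
        .Pattern = strPattern
    End With

    ' Turn spaces into hyphens, strip everything else, then lower-case the result
    NormalizeToUrl = LCase(regEx.Replace(Replace(cell.Value, " ", "-"), ""))
End Function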
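The key point is that we first replace all spaces with hyphens, then use a regular expression that matches any run of characters that are neither word characters nor hyphens, and remove those matches with RegExp.Replace.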
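As a quick check of that two-step idea, here is a minimal sketch that applies the same pattern to a plain string and prints the result to the Immediate window. The names `SlugFromText` and `DemoSlug` are hypothetical helpers introduced only for illustration, and the sample text is just an example. Keep in mind that in VBScript regular expressions `\w` is equivalent to `[A-Za-z0-9_]`, so accented and other non-ASCII letters are stripped along with the punctuation.

```vba
' Hypothetical helper: same logic as NormalizeToUrl, but it takes a String,
' so it can be called from other VBA code as well as from a worksheet.
Function SlugFromText(ByVal rawText As String) As String
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")   ' late binding, no reference needed
    regEx.Global = True                           ' replace every match, not just the first
    regEx.Pattern = "[^\w-]+"                     ' anything that is not a word char or hyphen

    ' Spaces become hyphens, everything else is stripped, result is lower-cased
    SlugFromText = LCase(regEx.Replace(Replace(rawText, " ", "-"), ""))
End Function

Sub DemoSlug()
    ' Prints: alpha-test-buy-berta-global-associates-c
    Debug.Print SlugFromText("Alpha test' buy Berta Global Associates (C)")
End Sub
```

From a worksheet, the answer's own function would be called as `=NormalizeToUrl(A38)`.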
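Update:
After your comment, it is still not clear what you want to do with Unicode letters: remove them or replace them with hyphens. Here is a function I tried to reconstruct from your formula, although its logic may be flawed. I prefer the generic approach above.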
Function NormalizeToUrl(cell As Range) As String
    Dim strPattern As String
    Dim regEx As Object

    Set regEx = CreateObject("vbscript.regexp")
    strPattern = "[^\w -]"
    regEx.Global = True   ' applies to every pattern used below

    ' 1. These characters are removed outright
    regEx.Pattern = "[?,.')(:!""]+"
    NormalizeToUrl = regEx.Replace(cell.Value, "")

    ' 2. "&" turns into "and"
    NormalizeToUrl = Replace(NormalizeToUrl, "&", "and")

    ' 3. Spaces and all remaining non-word characters become hyphens
    regEx.Pattern = strPattern
    NormalizeToUrl = LCase(regEx.Replace(Replace(NormalizeToUrl, " ", "-"), "-"))

    ' 4. Shrink every run of hyphens to a single hyphen
    regEx.Pattern = "--+"
    NormalizeToUrl = regEx.Replace(NormalizeToUrl, "-")
End Function
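To make the four steps easier to follow, here is a small walk-through sketch that runs the same patterns on a sample string and prints each intermediate value to the Immediate window. The procedure name `DemoNormalizeSteps` and the sample text are hypothetical and chosen only for illustration.

```vba
' Hypothetical walk-through: the same four steps as the function above,
' applied to a sample string, printing the value after each step.
Sub DemoNormalizeSteps()
    Dim regEx As Object
    Dim s As String
    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Global = True

    s = "Tom & Jerry / Part 2?"

    regEx.Pattern = "[?,.')(:!""]+"          ' 1. delete the listed punctuation
    s = regEx.Replace(s, "")
    Debug.Print s                            ' Tom & Jerry / Part 2

    s = Replace(s, "&", "and")               ' 2. spell out ampersands
    Debug.Print s                            ' Tom and Jerry / Part 2

    regEx.Pattern = "[^\w -]"                ' 3. spaces, then other symbols, become hyphens
    s = LCase(regEx.Replace(Replace(s, " ", "-"), "-"))
    Debug.Print s                            ' tom-and-jerry---part-2

    regEx.Pattern = "--+"                    ' 4. collapse runs of hyphens
    s = regEx.Replace(s, "-")
    Debug.Print s                            ' tom-and-jerry-part-2
End Sub
```

One difference from the original formula shows up here: the underscore counts as a word character for `\w`, so this version keeps "_" unchanged, whereas the formula replaced it with a hyphen.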