如何使用excel VBA脚本删除某些字符

时间:2011-11-19 17:49:47

标签: regex excel vba excel-vba

以下VBA脚本删除了不需要的字符,但遗憾的是只有NUMBERS。

你能不能请我帮忙,它也需要摆脱字母,如下面的表格示例(粗体)。

范围可以是0到15000+单元格

............................................... ......

a a york a a

b b b b

c new c york c watertown c ny c

6 ave 6 6 york 6 城市 6

............................................... .......

VBA脚本:

Sub Remove()

Application.ScreenUpdating = False
Dim R As RegExp, C As Range
For Each C In Range("A1:A" & Cells(Rows.Count, 1).End(xlUp).Row)
If R Is Nothing Then
Set R = New RegExp
R.Global = True
R.Pattern = "\D"
C.Offset(0, 1) = R.Replace(C, "")
R.Pattern = "\d"
C = R.Replace(C, "")
End If
Set R = Nothing
Next C
Application.ScreenUpdating = True
End Sub

EDIT1

Sub Remove()
Call BackMeUp

Dim cell As Range
Dim RE As Object
Dim Whitecell As Range
Dim strFind As String, strReplace As String
Dim lLoop As Long
Dim Loop1 As Long

    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual
    Application.EnableEvents = False
Range("A3:L3").Select
Selection.Delete Shift:=xlUp
'--------------------------------------------------Remove JUNK
Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row).Select
On Error Resume Next
For lLoop = 1 To 100
    strFind = Choose(lLoop, "~?»", "~®", "~.", "~!", "~ï", "~-", "~§", "~$", "~%", "~&", "~/", "~\", "~,", "~(", "~)", "~=", "~www", "~WWW", "~.com", "~.net", "~.org", "~{", "~}", "~[", "~]", "~ï", "~¿", "~½", "~:", "~;", "~_", "~µ", "~@", "~#", "~'", "~|", "~€", "~ä", "~ö", "~ü", "~Ä", "~Ü", "~Ö", "~+", "~<", "~>", "~nbsp", "~â", "~¦", "~©", "~Â", "~–", "~¼", "~?")
    strReplace = Choose(lLoop, " ")

    Selection.Replace What:=strFind, Replacement:=strReplace, LookAt:=xlPart, _
    SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
    ReplaceFormat:=False

Next lLoop
'--------------------------------------------------Remove Numbers
Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row).Select
On Error Resume Next
For Loop1 = 1 To 40
    strFind = Choose(lLoop, "~1", "~2", "~3", "~4", "~5", "~6", "~7", "~8", "~9", "~0")
    strReplace = Choose(Loop1, " ")

    Selection.Replace What:=strFind, Replacement:=strReplace, LookAt:=xlPart, _
    SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
    ReplaceFormat:=False

Next Loop1
'--------------------------------------------------Remove Single Letters
Set RE = CreateObject("vbscript.regexp")
RE.Global = True
RE.MultiLine = True
RE.Pattern = "^[a-z]\b | \b[a-z]\b"

For Each cell In Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row)
    cell.Value = RE.Replace(cell.Value, "")

Next

'--------------------------------------------------Remove WHITE SPACES

For Each Whitecell In Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row)
    Whitecell = WorksheetFunction.Trim(Whitecell)
Next Whitecell

'--------------------------------------------------Remove DUPES

ActiveSheet.Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row).RemoveDuplicates Columns:=1, Header:=xlYes
ActiveWorkbook.Worksheets(ActiveSheet.Name).Sort.SortFields.Clear

'--------------------------------------------------Copy to B - REPLACE ALL WHITE IN B

Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row).Select
    Selection.Copy
    Range("B3:B" & Cells(Rows.Count, 1).End(xlUp).Row).Select
    ActiveSheet.Paste
    Selection.Replace What:=" ", Replacement:="", LookAt:=xlPart, _
        SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
        ReplaceFormat:=False
        Columns("A:L").EntireColumn.AutoFit

'--------------------------------------------------END
    Application.ScreenUpdating = True
    Application.Calculation = xlCalculationAutomatic
    Application.EnableEvents = True
    Range("a1").Select
End Sub

3 个答案:

答案 0 :(得分:4)

编辑删除了原始答案,因为它在收到您想要的更多信息后却不适用,但是留下了建议)

  • 您正在创建/销毁每个单元格的RE对象,即 贵/ unnessessary
    • 如果其他用户将使用该功能,请在代码中创建对象,而不是添加引用
    • 最后不需要将正则表达式对象设置为空 - 在函数末尾从内存中释放变量 自动
    • 改进变量命名并使用适当的缩进可以帮助提高可读性并使其更易于编辑
    • 添加多行选项,以防您的单元格中包含换行符。
    • 如果使用大量单元格,您可能需要使用变量数组

UDPATE 2

基于下面的评论,这里是如何仅出现两个或更多个小写字符以及中间的单个空格。我个人认为一个好的方法是提取你 DO 想要的东西,而不是替换你不想想要的东西。我在这个网站上分享了以下功能,因为它真的很有用。以下是如何在A列的内容上调用它并将结果放在B列中的示例。

Sub test()

' Show how to run this on cells in A and transpose result in B
Dim varray As Variant
Dim i As Long

Application.ScreenUpdating = False
varray = Range("A1:A15000").Value

For i = 1 To UBound(varray, 1)
    varray(i, 1) = RegexExtract(varray(i, 1), "([a-z]{2,})", " ")
Next

Range("B1").Resize(UBound(varray, 1)).Value = _
Application.WorksheetFunction.Transpose(varray)

Application.ScreenUpdating = True

End Sub

并确保它在模块中:

Function RegexExtract(ByVal text As String, _
                      ByVal extract_what As String, _
                      Optional seperator As String = "") As String

Dim i As Long
Dim j As Long
Dim result As String
Dim allMatches As Object
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")

RE.Pattern = extract_what
RE.Global = True
Set allMatches = RE.Execute(text)

For i = 0 To allMatches.Count - 1
    For j = 0 To allMatches.Item(i).submatches.Count - 1
        result = result & seperator & allMatches.Item(i).submatches.Item(j)
    Next
Next

If Len(result) <> 0 Then
    result = Right$(result, Len(result) - Len(seperator))
End If

RegexExtract = result

End Function

答案 1 :(得分:3)

你的&#34; R.Pattern =&#34; \ d&#34;是你需要改变的唯一一条线。 &#34; \ d&#34;是一个描述&#34;数字&#34;的正则表达式。

我建议改变&#34; \ d&#34;到&#34; ^ [a-z0-9] | [A-Z0-9] \ B&#34;作为一个起点。

答案 2 :(得分:3)

我在下面重写了你的代码,以便

  • RegExp仅创建一次。您当前的代码会创建一个新对象,然后针对每个被测试的单元销毁它,因为它位于循环中
  • 下面的代码使用变量数组来最小化操作每个单元格值时的处理时间。常数VbNullString比“”快一点。
  • 你使用正则表达式中的简单\ w来匹配任何a-z0-9
  • RegExp对象的后期绑定避免了让第三方设置引用的必要性,将ignore case设置为true会使您的替换不区分大小写

         Sub Remove()
         Dim R As Object
         Dim C As Range
         Dim lngrow As Long
         Dim rng1 As Range
         Dim X
         Set R = CreateObject("vbscript.regexp")
         With R
           .Global = True
           .Pattern = "^\w\s|\b\w\b"
           .ignoreCase = True
         End With
         Application.ScreenUpdating = False
         Set rng1 = Range("A1:A" & Cells(Rows.Count, 1).End(xlUp).Row)
         X = rng1.Value2
         For lngrow = 1 To UBound(X, 1)
           X(lngrow, 1) = R.Replace(X(lngrow, 1), vbNullString)
         Next lngrow
         rng1.Value2 = X
         Application.ScreenUpdating = True
          End Sub