Cleaning data with multiple delimiters using VBA

Date: 2019-01-20 02:02:52

Tags: excel vba data-cleaning

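I have a manually entered list of ticket numbers with varying delimiters and some stray strings. I am trying to clean it up so that only the ticket numbers starting with "AK1" remain, separated by commas.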

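3 answers:

Answer 0 (score: 1)

Multi Split

Dirty version

In this version every string is processed on a one-cell-to-one-(other)-cell basis: each source cell in column A produces one target cell in column B, with the listed delimiters replaced by commas.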

Sub MultiSplit()

    Const cDel As String = ";,/"  ' Delimiter List, comma-separated: ";" and "/"
    Const cCol1 As Variant = "A"  ' Source Column Letter/Number
    Const cCol2 As Variant = "B"  ' Target Column Letter/Number
    Const cDelR As String = ","   ' Replace Delimiter
    Const cFirstR As Long = 1     ' First Row Number

    Dim vntR As Variant   ' Range Array
    Dim vntD As Variant   ' Delimiter Array

    Dim LastR As Long     ' Last Row Number
    Dim i As Long         ' Range Array Row Counter
    Dim j As Long         ' Delimiter Array Row Counter

    ' Calculate Last Row Number.
    LastR = Cells(Rows.Count, cCol1).End(xlUp).Row

    ' Copy Source Range into Range Array.
    vntR = Range(Cells(cFirstR, cCol1), Cells(LastR, cCol1))

    ' Split Delimiter List into Delimiter Array
    vntD = Split(cDel, ",")

    ' Calculate values in Range Array.
    For i = 1 To UBound(vntR) ' Range Array
        For j = 0 To UBound(vntD) ' Delimiter Array
            ' Replace by overwriting.
            vntR(i, 1) = Replace(vntR(i, 1), vntD(j), cDelR)
        Next
    Next

    ' Copy Range Array to Target Range.
    Range(Cells(cFirstR, cCol2), Cells(LastR, cCol2)) = vntR

End Sub

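Clean, one-string version

If you want all the AK1 tickets in a single cell, use the following code. Adjust cDelC (the final delimiter) to suit your needs (e.g. "aa, aa" or "aa,aa").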

Sub MultiSplit2()

    Const cDel As String = ";,/"     ' Delimiter List, comma-separated: ";" and "/"
    Const cCol1 As Variant = "A"     ' Source Column Letter/Number
    Const cCol2 As Variant = "B"     ' Target Column Letter/Number
    Const cDelR As String = ","      ' Replace Delimiter
    Const cFirstR As Long = 1        ' First Row Number
    Const cDelC As String = ", "     ' Clean Delimiter
    Const cString As String = "AK1"  ' Desired Start String

    Dim vntR As Variant   ' Range Array
    Dim vntD As Variant   ' Delimiter Array
    Dim vntT As Variant   ' Temporary Array

    Dim LastR As Long     ' Last Row Number
    Dim i As Long         ' Range Array Row Counter
    Dim j As Long         ' Delimiter Array Row Counter
    Dim strT As String    ' Target String


    ' Calculate Last Row Number.
    LastR = Cells(Rows.Count, cCol1).End(xlUp).Row

    ' Copy Source Range into Range Array.
    vntR = Range(Cells(cFirstR, cCol1), Cells(LastR, cCol1))

    ' Split Delimiter List into Delimiter Array
    vntD = Split(cDel, ",")

    ' Calculate values in Range Array.
    For i = 1 To UBound(vntR) ' Range Array
        For j = 0 To UBound(vntD) ' Delimiter Array
            ' Replace by overwriting.
            vntR(i, 1) = Replace(vntR(i, 1), vntD(j), cDelR)
        Next
        Debug.Print vntR(i, 1)
    Next

    ' Clean the strings in Range Array.
    For i = 1 To UBound(vntR)
        vntT = Split(vntR(i, 1), cDelR)
        For j = 0 To UBound(vntT)
            If Left(Trim(vntT(j)), Len(cString)) = cString Then
                If strT <> "" Then
                    strT = strT & cDelC & Trim(vntT(j))
                Else
                    strT = Trim(vntT(j))
                End If
            End If
        Next
    Next

    ' Copy Target String to Target Cell.
    Cells(cFirstR, cCol2) = strT

End Sub
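
The two versions above can also be combined so that each row is cleaned on its own. The following is only a minimal sketch under stated assumptions: the function name CleanTickets is hypothetical (it is not part of the answer), and the delimiters, start string and output delimiter are simply copied from MultiSplit2.

Function CleanTickets(ByVal strS As String) As String
    ' Hypothetical helper, not from the answer above: cleans ONE string.

    Const cDel As String = ";,/"     ' Delimiter List, comma-separated
    Const cDelR As String = ","      ' Replace Delimiter
    Const cDelC As String = ", "     ' Clean Delimiter
    Const cString As String = "AK1"  ' Desired Start String

    Dim vntD As Variant   ' Delimiter Array
    Dim vntT As Variant   ' Temporary Array
    Dim j As Long         ' Array Counter
    Dim strT As String    ' Target String

    ' Replace every listed delimiter with the Replace Delimiter.
    vntD = Split(cDel, ",")
    For j = 0 To UBound(vntD)
        strS = Replace(strS, vntD(j), cDelR)
    Next

    ' Keep only the parts that start with the Desired Start String.
    vntT = Split(strS, cDelR)
    For j = 0 To UBound(vntT)
        If Left(Trim(vntT(j)), Len(cString)) = cString Then
            If strT <> "" Then strT = strT & cDelC
            strT = strT & Trim(vntT(j))
        End If
    Next

    CleanTickets = strT

End Function

Placed in a standard module, it could be called from the worksheet as =CleanTickets(A1) and filled down. Like MultiSplit2, it keeps each matching part whole, so any text that follows a ticket number inside the same part is kept as well.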

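Answer 1 (score: 0)

I suggest doing this with a UDF (user-defined function). Install the code below in a standard code module: press Alt+F11 to open the VB Editor window, right-click the VBA project in the Project Explorer window on the left, choose Insert > Module, and paste the code into the blank code pane on the right. Remember to save the workbook in xlsm (macro-enabled) format.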

Function ExtractAK1(Cell As Range) As String

    Const AK1 As String = "AK1-"

    Dim Var As Variant
    Dim Sp() As String
    Dim i As Integer

    Var = Cell.Value
    If VarType(Var) = vbString Then
        If InStr(1, Var, AK1, vbTextCompare) Then
            ' Split case-insensitively so it agrees with the InStr test above.
            Sp = Split(Trim(Var), AK1, -1, vbTextCompare)
            For i = 1 To UBound(Sp)
                Sp(i) = AK1 & Left(Trim(Sp(i)), 5)
            Next i
            Var = Join(Sp, ",")
            ExtractAK1 = Mid(Var, InStr(Var, ",") + 1)
        End If
    End If
End Function

Call the function from the worksheet just as you would a built-in Excel function:

=ExtractAK1($A2)

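If it is installed correctly, Excel will suggest the function's name as you start typing it. $A2 is the cell containing your text. Copy the formula down as needed. If that is more convenient for you, the function can also be repurposed to be called from a loop.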
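
Calling the function from a loop might look like the sketch below. The procedure name FillAK1 is hypothetical, and it assumes the text sits in column A of the active sheet from row 2 down and that column B is free for the results.

Sub FillAK1()
    ' Hypothetical wrapper around ExtractAK1; the column letters and the
    ' first row (2) are assumptions, adjust them to your sheet.
    Dim LastR As Long   ' Last used row in column A
    Dim r As Long       ' Row counter

    LastR = Cells(Rows.Count, "A").End(xlUp).Row
    For r = 2 To LastR
        ' Write the extracted ticket list next to the source cell.
        Cells(r, "B").Value = ExtractAK1(Cells(r, "A"))
    Next r
End Sub

This writes static values rather than formulas, so the results do not update when column A changes.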

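Answer 2 (score: 0)

The following UDF extracts, from whatever you pass in, a comma-separated list of AK1 ticket numbers. It assumes the ticket number format is AK1- followed only by digits, which is what you showed. It extracts only the ticket numbers, which appears to be what you want.

  • If you pass a single string or a single cell, the tickets found in it are returned.
  • If you pass a range of cells, the results are combined into one output string.
  • VBA's regular expression engine is used to extract the ticket numbers.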

Option Explicit
  Public RE As Object
  Public MC As Object
  Public M As Object

    'Assume starts with AK1- and ends with numbers
    '  as per your example
 Public Const sPat As String = "\bAK1-\d+"

Function getAK(vIN As Variant) As String
    Dim V As Variant
    Dim sTemp As String

Set RE = CreateObject("vbscript.regexp")
With RE
    .Pattern = sPat
    .ignorecase = False
    .Global = True
End With

If IsArray(vIN) Then
    For Each V In vIN
        sTemp = sTemp & "," & getStrOnly(CStr(V))
    Next V
Else
    getAK = getStrOnly(CStr(vIN))
    Exit Function
End If

getAK = Mid(sTemp, 2)

End Function

Private Function getStrOnly(str As String) As String
    Dim sTemp As String
    With RE
        If .test(str) = True Then
            Set MC = .Execute(str)
            For Each M In MC
                sTemp = sTemp & "," & M
            Next M
        End If
    End With
    getStrOnly = Mid(sTemp, 2)
End Function

Using the single-cell version: =getAK(A1)

Using the multi-cell approach:

=getAK(A1:A12)

we get

AK1-97760,AK1-96767,AK1-97719,AK1-97999,AK1-98105,,AK1-97113,AK1-97073,AK1-97019,AK1-97951,AK1-97858,AK1-97195,AK1-96806,AK1-97719,AK1-97896,AK1-98115,AK1-98151,AK1-98089,AK1-96780,AK1-90919,AK1-96705,AK1-96806,AK1-95397
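
The doubled comma in the output above appears where a cell in the range yields no match: getAK still appends a comma for that cell, followed by an empty string. If that is unwanted, a variant along the following lines should avoid it. This is a sketch, not the answer's code: the name getAKNoBlanks is hypothetical, and it uses a local regex object with the same pattern instead of the module-level variables.

Function getAKNoBlanks(vIN As Variant) As String
    ' Hypothetical variant of getAK: a comma is only added per match,
    ' so cells without any ticket number contribute nothing.
    Dim oRE As Object     ' Late-bound VBScript regular expression
    Dim oM As Object      ' Single match
    Dim V As Variant
    Dim sTemp As String

    Set oRE = CreateObject("vbscript.regexp")
    With oRE
        .Pattern = "\bAK1-\d+"   ' Same pattern as sPat above
        .IgnoreCase = False
        .Global = True
    End With

    If IsArray(vIN) Then
        For Each V In vIN
            For Each oM In oRE.Execute(CStr(V))
                sTemp = sTemp & "," & oM.Value
            Next oM
        Next V
    Else
        For Each oM In oRE.Execute(CStr(vIN))
            sTemp = sTemp & "," & oM.Value
        Next oM
    End If

    ' Drop the leading comma (Mid returns "" when nothing was found).
    getAKNoBlanks = Mid(sTemp, 2)
End Function

It is called the same way, for example =getAKNoBlanks(A1:A12).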

If you also want to return the ticket's status (the part in parentheses after the ticket number), you can change the regular expression to:

"\bAK1-\d+(?:\s*\([^)]+\))?"

If your ticket pattern is different, you can change the regular expression accordingly.