Slight adjustment to a user-defined function

Asked: 2015-06-10 13:47:30

Tags: regex excel-vba user-defined-functions udf vba

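I want to extract a combination of text and numbers from larger strings held in a column in Excel.

The constants I have to work with are that each text string:

• starts with A, C, or S
• is always 7 characters long
• appears at a varying position within the larger string

The code I have been using has worked well so far: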

Public Function Xtractor(r As Range) As String
Dim a, ary
ary = Split(r.Text, " ")
    For Each a In ary
        If Len(a) = 7 And a Like "[SAC]*" Then
            Xtractor = a
            Exit Function
        End If
    Next a
Xtractor = ""
End Function

But today I learned that my data may sometimes contain cases like this:

[Image: screenshot of example data, not preserved]

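What I would like is to adjust my code so that, if the 8th character is an underscore and the first of the 7 characters is S, A, or C, everything up to the underscore is extracted.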

Secondly, I want to exclude common words such as "Support" and "Collect" from being extracted.

Finally, the 7th character should be a number.

Any ideas on this would be greatly appreciated.

Thanks

2 answers:

Answer 0 (score: 0)

Add Microsoft VBScript Regular Expressions 5.5 to the project references. Use the following code to test matching and extraction with Xtractor:

Public Function Xtractor(ByVal p_val As String) As String
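  Xtractor = ""
  ' Requires the Microsoft VBScript Regular Expressions 5.5 reference
  Dim v_re As New VBScript_RegExp_55.RegExp, Matches
  ' Group 1: S, A, or C at the start, then 1 to 6 non-underscore characters
  v_re.Pattern = "^([SAC][^_]{1,6})_?"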
  Set Matches = v_re.Execute(p_val)
  If Matches.Count > 0 Then Xtractor = Matches(0).SubMatches(0) Else Xtractor = ""
End Function
Sub test_Xtractor(p_cur As Range, p_val As String, p_expected As String)
  Dim v_cur As Range, v_res As Range
  p_cur.Value = p_val
  Set v_cur = p_cur.Offset(columnOffset:=1)
  v_cur.FormulaR1C1 = "='" & ThisWorkbook.Name & "'!Xtractor(RC[-1])"
  Set v_res = v_cur.Offset(columnOffset:=1)
  v_res.FormulaR1C1 = "=RC[-1]=""" & p_expected & """"
  Debug.Print p_val; "->"; v_cur.Value; ":"; v_res.Value
End Sub
Sub test()
  test_Xtractor ActiveCell, "A612002_MDC_308", "A612002"
  test_Xtractor ActiveCell.Offset(1), "B612002_MDC_308", ""
  test_Xtractor ActiveCell.Offset(2), "SUTP038_MDC_3", "SUTP038"
  test_Xtractor ActiveCell.Offset(3), "KUTP038_MDC_3", ""
End Sub

Select the workbook and the cell where the test fixture should be written, then run test from the VBA editor. Output in the Immediate window (Ctrl+G):

A612002_MDC_308->A612002:True
B612002_MDC_308->:True
SUTP038_MDC_3->SUTP038:True
KUTP038_MDC_3->:True

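Update

"Is it possible to modify this code so that the result is blank if the 7th character is a letter?"

Replace the line that assigns v_re.Pattern with the following: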

v_re.Pattern = "^([SAC](?![^_]{5}[A-Z]_?)[^_]{1,6})_?"
v_re.IgnoreCase = True

And add this to the test suite:

test_Xtractor ActiveCell.Offset(4), "SUTP03A_MDC_3", ""

Output:

A612002_MDC_308->A612002:True
B612002_MDC_308->:True
SUTP038_MDC_3->SUTP038:True
KUTP038_MDC_3->:True
SUTP03A_MDC_3->:True

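I inserted the negative lookahead sub-rule (?![^_]{5}[A-Z]_?) to reject SUTP03A_MDC_3. Note that the rejection rule applies to the 7th character only. v_re.IgnoreCase is currently set to True; if only uppercase characters should be allowed, set it to False. See also Regular Expression Syntax on MSDN.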
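The negative lookahead rejects a letter in the 7th position. The same requirement can also be stated positively, by demanding a digit there. The following is only a sketch under assumptions that go beyond this answer: the code must be exactly 7 characters long and is followed by an underscore or the end of the string. The name Xtractor7 is hypothetical, and late binding via CreateObject is used so that the sketch does not depend on the project reference.

```vba
' Hypothetical variant for illustration; Xtractor7 is not part of the answer's code.
Public Function Xtractor7(ByVal p_val As String) As String
    Dim re As Object, m As Object
    ' Late binding: no project reference required
    Set re = CreateObject("VBScript.RegExp")
    ' S, A, or C, then five non-underscore characters, then one digit,
    ' followed by an underscore or the end of the string
    re.Pattern = "^([SAC][^_]{5}\d)(?:_|$)"
    re.IgnoreCase = False
    Set m = re.Execute(p_val)
    ' With no match the function keeps its default value, an empty string
    If m.Count > 0 Then Xtractor7 = m(0).SubMatches(0)
End Function
```

Unlike the pattern above, this variant would also reject codes shorter than 7 characters, so it only fits if the 7-character rule from the question always holds.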

Answer 1 (score: 0)

Try this:

ary = Split(Replace(r.Text, "_", " "))

or

ary = Split(Replace(r.Text, "_", " "), " ")

Both variants give the same result.
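The two variants agree because Split uses a single space as its delimiter when the delimiter argument is omitted. A small check along these lines can confirm it for a given input; SplitVariantsAgree is a hypothetical helper name used only for illustration.

```vba
' Hypothetical helper for illustration; not part of the answer's code.
Public Function SplitVariantsAgree(ByVal s As String) As Boolean
    Dim a As Variant, b As Variant, i As Long
    a = Split(Replace(s, "_", " "))        ' delimiter omitted: defaults to a space
    b = Split(Replace(s, "_", " "), " ")   ' delimiter given explicitly
    ' Different element counts: leave the result as False
    If UBound(a) <> UBound(b) Then Exit Function
    For i = LBound(a) To UBound(a)
        If a(i) <> b(i) Then Exit Function
    Next i
    SplitVariantsAgree = True
End Function
```

For example, Debug.Print SplitVariantsAgree("A612002_MDC_308") prints True in the Immediate window.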

Test

[Image: screenshot of the test results, not preserved]

Update

"Do you know how to leave the result blank if the 7th character is a letter?"

Public Function Xtractor(r As Range) As String
Dim a, ary
ary = Split(Replace(r.Text, "_", " "))
    For Each a In ary
        If Len(a) = 7 And a Like "[SAC]*" And IsNumeric(Mid(a, 7, 1)) Then
            Xtractor = a
            Exit Function
        End If
    Next a
Xtractor = ""
End Function

Test

[Image: screenshot of the test results, not preserved]
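The three conditions in the If statement (length 7, first character S, A, or C, numeric 7th character) could also be folded into a single Like pattern, where ? matches any one character and # matches any one digit. This is a sketch rather than the answerer's code: XtractorLike is a hypothetical name, it takes a String instead of a Range so it can be tried from the Immediate window, and it assumes the module uses the default Option Compare Binary, under which Like is case-sensitive.

```vba
' Hypothetical variant for illustration; XtractorLike is not from the thread.
Public Function XtractorLike(ByVal s As String) As String
    Dim parts As Variant, token As Variant
    ' Treat underscores as separators, then split on spaces
    parts = Split(Replace(s, "_", " "), " ")
    For Each token In parts
        ' [SAC] = first character, ????? = any five characters, # = one digit
        If token Like "[SAC]?????#" Then
            XtractorLike = token
            Exit Function
        End If
    Next token
    ' No match: the function keeps its default value, an empty string
End Function
```

Because the 7th character must be a digit, 7-letter words such as "Support" and "Collect" are rejected without a separate exclusion list.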