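I have large text documents that contain some data I want to extract.
As you can see in the screenshot, I want to extract A040 into the Excel column next to the file name.
There are always three spaces before A040, and then the text Sheet (also shown in the screenshot).
Every file has a different number, but it is always the letter A with three digits, and then the text Sheet. -> Example file uploaded:
I have already tried this with VBA in Excel, but it does not work properly.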
Dim cell As Range
Dim rng As Range
Dim output As String
Set rng = ws.Range("A1", ws.Range("A1").SpecialCells(xlLastCell).Address)
For Each cell In rng
    On Error Resume Next
    output = ExtA(cell.Value)
    If Len(output) > 0 Then
        Range("B" & j) = output
        Exit For
    End If
Next
j = j + 1
ws.Cells.ClearContents
'Call DelConns
strFileName = Dir 'next file
Loop
End Sub
Function ExtA(ByVal text As String) As String
    'REGEX Match VBA in excel
    Dim result As String
    Dim allMatches As Object
    Dim RE As Object
    Set RE = CreateObject("vbscript.regexp")
    RE.Pattern = "(?<=Sheet)[^Sheet]*\ Sheet"
    RE.Global = True
    RE.IgnoreCase = True
    Set allMatches = RE.Execute(text)
    If allMatches.Count <> 0 Then
        result = allMatches.Item(0).submatches.Item(0)
    End If
    ExtA = result
End Function
Answer 0 (score: 3)
This seems to work on your sample.
Option Explicit
Function AthreeDigits(str As String)
    Dim n As Long, nums() As Variant
    Static rgx As Object, cmat As Object
    'with rgx as static, it only has to be created once; beneficial when filling a long column with this UDF
    If rgx Is Nothing Then
        Set rgx = CreateObject("VBScript.RegExp")
    Else
        Set cmat = Nothing
    End If
    AthreeDigits = vbNullString
    With rgx
        .Global = False
        .MultiLine = True
        'note: VBScript.RegExp has no \A anchor, so \A here matches a literal "A"
        .Pattern = "\A[0-9]{3}[\s]{3}Sheet"
        If .Test(str) Then
            Set cmat = .Execute(str)
            'the match starts at the "A", so the first 4 characters are the A### code
            AthreeDigits = Left(cmat.Item(0), 4)
        End If
    End With
End Function
Answer 1 (score: 2)
Did you mean that there are 4 spaces after A040 and before "Sheet"? If so, try the following pattern:
.pattern = "(A\d\d\d)\s{3}Sheet"
Edit: I thought you said 4 spaces, but you said 3. My pattern now reflects that.
Edit 2: (I need more coffee!) Changed \b to \s.
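The pattern above puts the A### part in a capture group, so only that part can be read back through SubMatches. A minimal sketch of how this might be wrapped in a function follows; the name ExtractCode and the choice of settings (Global and IgnoreCase both off) are assumptions for illustration, not part of the answer.

```vba
Option Explicit

' Hypothetical helper (not from the answer): returns the "A" + three digits
' code that is followed by exactly three whitespace characters and "Sheet",
' or an empty string when nothing matches.
Function ExtractCode(ByVal text As String) As String
    Dim re As Object
    Dim matches As Object

    ' Late binding, so no library reference has to be set in the VBA editor
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "(A\d\d\d)\s{3}Sheet"
    re.Global = False       ' assumption: the first match per text is enough
    re.IgnoreCase = False   ' assumption: the code always uses an uppercase A

    Set matches = re.Execute(text)
    If matches.Count > 0 Then
        ' SubMatches(0) is the first capture group, i.e. only the A### part
        ExtractCode = matches(0).SubMatches(0)
    End If
End Function
```

With this sketch, ExtractCode("   A040   Sheet") returns A040, and the call in the question's loop could become output = ExtractCode(cell.Value).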
Answer 2 (score: 0)
See the examples here:
"\s+[Aa]\d*\s+Sheet"
or
\s+[Aa]\d*\s+(Sheet)
or
[Aa]\d*\s+(Sheet)
Demo:
https://regex101.com/r/Qo8iUf/3
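\s+ matches any whitespace character (equal to [\r\n\t\f\v ])
+ Quantifier: matches between one and unlimited times, as many times as possible
[Aa] matches a single character present in the list Aa (case sensitive)
\d* matches a digit (equal to [0-9])
* Quantifier: matches between zero and unlimited times, as many times as possible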
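Because \d* allows zero or any number of digits, these patterns accept more than the "A plus exactly three digits" format described in the question. The sketch below compares the first pattern with a stricter \d{3} variant; the Sub name, the stricter pattern, and the sample strings are made up for illustration and are not taken from the uploaded file.

```vba
Option Explicit

' Hypothetical demo: prints whether each pattern matches each sample string.
Sub CompareQuantifiers()
    Dim re As Object
    Dim samples As Variant, patterns As Variant
    Dim s As Variant, p As Variant

    Set re = CreateObject("VBScript.RegExp")

    ' Made-up samples: three digits, no digits, five digits
    samples = Array("   A040   Sheet", "   A   Sheet", "   A12345 Sheet")
    ' The answer's first pattern, and an assumed stricter variant
    patterns = Array("\s+[Aa]\d*\s+Sheet", "\s+[Aa]\d{3}\s+Sheet")

    For Each p In patterns
        re.Pattern = p
        For Each s In samples
            ' Test returns True when the pattern matches anywhere in the string
            Debug.Print p, "[" & s & "]", re.Test(s)
        Next s
    Next p
End Sub
```

Run from the VBA editor, this prints to the Immediate window. The \d* pattern matches all three samples, while the \d{3} variant matches only the first one.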