I have the following pattern:
private const string _usernamePattern = "Username: <strong>.*</strong>";
and this code:
private string Grab(string text, string pattern)
{
    Regex regex = new Regex(pattern);
    if (!regex.IsMatch(text))
        throw new Exception();
    else
        return regex.Match(text).Value;
}
It works fine for strings like this:
Username: <strong>MyUsername</strong>
But I only need to grab MyUsername, without the <strong> tags.
How can I do that?
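Answer 0 (score: 2)
You really shouldn't use a regular expression for this; use a dedicated HTML parser instead.
See this question for why:
RegEx match open tags except XHTML self-contained tags
However, if this is a very limited case rather than a block of HTML, and all you want is the text between two tags, you can use the following pattern...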
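As a minimal sketch of that kind of pattern (an assumption for illustration, not necessarily the pattern this answer had in mind), a lookbehind and a lookahead can keep the tags out of the match, so that match.Value is only the text between them. The names UsernamePattern and Program are hypothetical, and the lazy .*? is a choice made here so the match stops at the first closing tag:

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    // Hypothetical pattern: the lookbehind (?<=...) and lookahead (?=...) must be
    // present around the match, but they are not included in the matched value.
    private const string UsernamePattern = "(?<=Username: <strong>).*?(?=</strong>)";

    public static string Grab(string text, string pattern)
    {
        Match match = Regex.Match(text, pattern);
        if (!match.Success)
            throw new ArgumentException("Pattern not found in text.");
        return match.Value;
    }

    static void Main()
    {
        // Prints: MyUsername
        Console.WriteLine(Grab("Username: <strong>MyUsername</strong>", UsernamePattern));
    }
}
```

This only suits short, predictable strings; for real HTML the parser advice above still applies.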
Answer 1 (score: 1)
Try:
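// The named group (?<Email>...) captures only the text between the tags.
private const string _usernamePattern = "Username: <strong>(?<Email>.*)</strong>";
...
private string Grab(string text, string pattern)
{
    // Match once and reuse the result instead of calling IsMatch and then Match.
    var match = Regex.Match(text, pattern);
    if (!match.Success)
        throw new Exception();
    else
        return match.Groups["Email"].Value;
}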
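One thing to watch with this approach: .* is greedy, so if the text contains more than one <strong>...</strong> pair, the group runs to the last </strong>. A small sketch of the same idea with a lazy quantifier, assuming you want only the first pair (the group name Username and the class LazyGroupDemo are hypothetical, chosen for this example):

```csharp
using System;
using System.Text.RegularExpressions;

class LazyGroupDemo
{
    public static string GrabUsername(string text)
    {
        // .*? is lazy, so the capture ends at the first </strong> after the label.
        Match match = Regex.Match(text, "Username: <strong>(?<Username>.*?)</strong>");
        return match.Success ? match.Groups["Username"].Value : null;
    }

    static void Main()
    {
        string html = "Username: <strong>MyUsername</strong> Role: <strong>Admin</strong>";
        // Prints: MyUsername
        Console.WriteLine(GrabUsername(html));
    }
}
```

With the greedy .* the same input would capture "MyUsername</strong> Role: <strong>Admin" instead.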