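I am trying to match a pattern that repeats within a block of text. When I run my regex, it matches everything between the first and the last occurrence instead of each occurrence separately. I did eventually get the result I wanted, but I would like to know whether my regex can be simplified or optimized. Suggestions are welcome.
Here is the text: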
$text = @"
Microsoft (R) Windows Debugger Version 6.3.9600.17336 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.
Loading Dump File [C:\Windows\MEMORY.DMP]
Kernel Summary Dump File: Only kernel address space is available
************* Symbol Path validation summary **************
Response Time (ms) Location
Deferred srv*DownstreamStore*https://msdl.microsoft.com/download/symbols
************* Symbol Path validation summary **************
Response Time (ms) Location
Deferred srv*DownstreamStore*https://msdl.microsoft.com/download/symbols
Symbol search path is: srv*DownstreamStore*https://msdl.microsoft.com/download/symbols
Executable search path is:
Windows 7 Kernel Version 7600 UP Free x64
Product: LanManNt, suite: SmallBusiness TerminalServer SmallBusinessRestricted SingleUserTS
Built by: 7600.16385.amd64fre.win7_rtm.090713-1255
Machine Name:
Kernel base = 0xfffff800`01658000 PsLoadedModuleList = 0xfffff800`01895e50
Debug session time: Tue Apr 16 04:27:05.412 2019 (UTC - 7:00)
System Uptime: 7 days 1:02:26.286
Loading Kernel Symbols
...............................................................
..............................................Page eb21d not present in the dump file. Type ".hh dbgerr004" for details
..................
.........
Loading User Symbols
PEB is paged out (Peb.Ldr = 000007ff`fffdb018). Type ".hh dbgerr001" for details
Loading unloaded module list
....
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
Use !analyze -v to get detailed debugging information.
BugCheck D1, {fffff8a0032dd010, 2, 0, fffff8800567d530}
*** ERROR: Module load completed but symbols could not be loaded for myfault.sys
Page eb21d not present in the dump file. Type ".hh dbgerr004" for details
Probably caused by : myfault.sys ( myfault+1530 )
Followup: MachineOwner
---------
----- 64 bit Kernel Summary Dump Analysis
DUMP_HEADER64:
MajorVersion 0000000f
MinorVersion 00001db0
KdSecondaryVersion 00000000
DirectoryTableBase 00000000`cadb0000
PfnDataBase fffffa80`00000000
PsLoadedModuleList fffff800`01895e50
PsActiveProcessHead fffff800`01877b30
MachineImageType 00008664
NumberProcessors 00000001
BugCheckCode 000000d1
BugCheckParameter1 fffff8a0`032dd010
BugCheckParameter2 00000000`00000002
BugCheckParameter3 00000000`00000000
BugCheckParameter4 fffff880`0567d530
KdDebuggerDataBlock fffff800`01841070
SecondaryDataState 00000000
ProductType 00000002
SuiteMask 00000131
SUMMARY_DUMP64:
DumpOptions 504d4453
HeaderSize 00024000
BitmapSize 00108000
Pages 00013cb0
Bitmap.SizeOfBitMap 00108000
KiProcessorBlock at fffff800`01900900
1 KiProcessorBlock entries:
fffff800`01842e80
Windows 7 Kernel Version 7600 UP Free x64
Product: LanManNt, suite: SmallBusiness TerminalServer SmallBusinessRestricted SingleUserTS
Built by: 7600.16385.amd64fre.win7_rtm.090713-1255
Machine Name:
Kernel base = 0xfffff800`01658000 PsLoadedModuleList = 0xfffff800`01895e50
Debug session time: Tue Apr 16 04:27:05.412 2019 (UTC - 7:00)
System Uptime: 7 days 1:02:26.286
start end module name
fffff800`0142c000 fffff800`01436000 kdcom Mon Jul 13 18:31:07 2009 (4A5BDFDB)
fffff800`0160f000 fffff800`01658000 hal Mon Jul 13 18:27:36 2009 (4A5BDF08)
fffff800`01658000 fffff800`01c35000 nt Mon Jul 13 16:40:48 2009 (4A5BC600)
fffff880`00c00000 fffff880`00c3c000 vmbus Mon Jul 13 16:42:54 2009 (4A5BC67E)
###
some similar text just to save characters
###
fffff960`00050000 fffff960`0035f000 win32k Mon Jul 13 16:40:16 2009 (4A5BC5E0)
Page eb21d not present in the dump file. Type ".hh dbgerr004" for details
fffff960`004c0000 fffff960`004de000 dxg Mon Jul 13 16:38:28 2009 (4A5BC574)
fffff960`00620000 fffff960`0062a000 TSDDD Mon Jul 13 17:16:34 2009 (4A5BCE62)
fffff960`008c0000 fffff960`008cb000 VMBusVideoD Mon Jul 13 16:43:00 2009 (4A5BC684)
fffff960`00af0000 fffff960`00b26000 RDPDD Mon Jul 13 17:16:54 2009 (4A5BCE76)
Unloaded modules:
fffff880`018e5000 fffff880`018f3000 crashdmp.sys
Timestamp: unavailable (00000000)
Checksum: 00000000
ImageSize: 0000E000
fffff880`018f3000 fffff880`018ff000 dump_ataport.sys
Timestamp: unavailable (00000000)
Checksum: 00000000
ImageSize: 0000C000
fffff880`018ff000 fffff880`01908000 dump_atapi.sys
Timestamp: unavailable (00000000)
Checksum: 00000000
ImageSize: 00009000
fffff880`00de5000 fffff880`00e00000 sacdrv.sys
Timestamp: unavailable (00000000)
Checksum: 00000000
ImageSize: 0001B000
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
Use !analyze -v to get detailed debugging information.
BugCheck D1, {fffff8a0032dd010, 2, 0, fffff8800567d530}
Page eb21d not present in the dump file. Type ".hh dbgerr004" for details
Probably caused by : myfault.sys ( myfault+1530 )
Followup: MachineOwner
---------
Finished dump check
"@
This should return 2 matches:
([regex]"(?ms)\*{20,}.+-{8,}").Matches($text)
But it returns this instead:
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
Use !analyze -v to get detailed debugging information.
BugCheck D1, {fffff8a0032dd010, 2, 0, fffff8800567d530}
*** ERROR: Module load completed but symbols could not be loaded for myfault.sys
Page eb21d not present in the dump file. Type ".hh dbgerr004" for details
Probably caused by : myfault.sys ( myfault+1530 )
Followup: MachineOwner
---------
----- 64 bit Kernel Summary Dump Analysis
DUMP_HEADER64:
MajorVersion 0000000f
MinorVersion 00001db0
KdSecondaryVersion 00000000
DirectoryTableBase 00000000`cadb0000
PfnDataBase fffffa80`00000000
PsLoadedModuleList fffff800`01895e50
PsActiveProcessHead fffff800`01877b30
MachineImageType 00008664
NumberProcessors 00000001
BugCheckCode 000000d1
BugCheckParameter1 fffff8a0`032dd010
BugCheckParameter2 00000000`00000002
BugCheckParameter3 00000000`00000000
BugCheckParameter4 fffff880`0567d530
KdDebuggerDataBlock fffff800`01841070
SecondaryDataState 00000000
ProductType 00000002
SuiteMask 00000131
SUMMARY_DUMP64:
DumpOptions 504d4453
HeaderSize 00024000
BitmapSize 00108000
Pages 00013cb0
Bitmap.SizeOfBitMap 00108000
KiProcessorBlock at fffff800`01900900
1 KiProcessorBlock entries:
fffff800`01842e80
Windows 7 Kernel Version 7600 UP Free x64
Product: LanManNt, suite: SmallBusiness TerminalServer SmallBusinessRestricted SingleUserTS
Built by: 7600.16385.amd64fre.win7_rtm.090713-1255
Machine Name:
Kernel base = 0xfffff800`01658000 PsLoadedModuleList = 0xfffff800`01895e50
Debug session time: Tue Apr 16 04:27:05.412 2019 (UTC - 7:00)
System Uptime: 7 days 1:02:26.286
start end module name
fffff800`0142c000 fffff800`01436000 kdcom Mon Jul 13 18:31:07 2009 (4A5BDFDB)
fffff800`0160f000 fffff800`01658000 hal Mon Jul 13 18:27:36 2009 (4A5BDF08)
fffff800`01658000 fffff800`01c35000 nt Mon Jul 13 16:40:48 2009 (4A5BC600)
fffff880`00c00000 fffff880`00c3c000 vmbus Mon Jul 13 16:42:54 2009 (4A5BC67E)
fffff880`00c3c000 fffff880`00c66000 ataport Mon Jul 13 16:19:52 2009 (4A5BC118)
####
Some similar text just to save characters
####
Page eb21d not present in the dump file. Type ".hh dbgerr004" for details
fffff960`004c0000 fffff960`004de000 dxg Mon Jul 13 16:38:28 2009 (4A5BC574)
fffff960`00620000 fffff960`0062a000 TSDDD Mon Jul 13 17:16:34 2009 (4A5BCE62)
fffff960`008c0000 fffff960`008cb000 VMBusVideoD Mon Jul 13 16:43:00 2009 (4A5BC684)
fffff960`00af0000 fffff960`00b26000 RDPDD Mon Jul 13 17:16:54 2009 (4A5BCE76)
Unloaded modules:
fffff880`018e5000 fffff880`018f3000 crashdmp.sys
Timestamp: unavailable (00000000)
Checksum: 00000000
ImageSize: 0000E000
fffff880`018f3000 fffff880`018ff000 dump_ataport.sys
Timestamp: unavailable (00000000)
Checksum: 00000000
ImageSize: 0000C000
fffff880`018ff000 fffff880`01908000 dump_atapi.sys
Timestamp: unavailable (00000000)
Checksum: 00000000
ImageSize: 00009000
fffff880`00de5000 fffff880`00e00000 sacdrv.sys
Timestamp: unavailable (00000000)
Checksum: 00000000
ImageSize: 0001B000
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
Use !analyze -v to get detailed debugging information.
BugCheck D1, {fffff8a0032dd010, 2, 0, fffff8800567d530}
Page eb21d not present in the dump file. Type ".hh dbgerr004" for details
Probably caused by : myfault.sys ( myfault+1530 )
Followup: MachineOwner
---------
I ended up doing the following to get just the second part:
$text -match [regex]"(?ms)\*{20,}.+-{8,}\s+\n+-.+\n(^\*{20,}.+\*\s+[bB].+\*{20,}.+-{8,})"
$Matches[1]
The desired result is only the second part:
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
Use !analyze -v to get detailed debugging information.
BugCheck D1, {fffff8a0032dd010, 2, 0, fffff8800567d530}
Page eb21d not present in the dump file. Type ".hh dbgerr004" for details
Probably caused by : myfault.sys ( myfault+1530 )
Followup: MachineOwner
---------
Answer 0 (score: 0)
You can make a small change to your first regex and use a lazy quantifier, .*?, so that each match stops at the first run of dashes instead of the last one. Then select the second of the two matches with index [1]:
([regex]"(?ms)\*{20,}.*?-{8,}").matches($text).value[1]
If you want both matches at once, just drop the index:
([regex]"(?ms)\*{20,}.*?-{8,}").matches($text).value
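To see the difference between the greedy and lazy forms in isolation, here is a minimal sketch. The $sample text is a hypothetical, shortened stand-in for the debugger output in the question, and the variable names are made up for illustration. It uses a single-quoted here-string so that nothing in the sample is expanded or treated as an escape.

```powershell
# Hypothetical, shortened stand-in for the debugger output in the question.
$sample = @'
********************
* Bugcheck Analysis *
********************
BugCheck D1, first block
Followup: MachineOwner
---------
----- 64 bit Kernel Summary Dump Analysis
module list goes here
********************
* Bugcheck Analysis *
********************
BugCheck D1, second block
Followup: MachineOwner
---------
Finished dump check
'@

# Greedy: .+ runs to the end of the text and backtracks to the LAST run of 8+ dashes.
$greedy = [regex]'(?s)\*{20,}.+-{8,}'
# Lazy: .*? grows one character at a time and stops at the FIRST run of 8+ dashes.
$lazy = [regex]'(?s)\*{20,}.*?-{8,}'

$greedyCount = $greedy.Matches($sample).Count   # 1: both blocks swallowed into one match
$lazyMatches = $lazy.Matches($sample)
$lazyCount   = $lazyMatches.Count               # 2: one match per block
$second      = $lazyMatches[1].Value            # only the second Bugcheck Analysis block

"Greedy matches: $greedyCount"
"Lazy matches:   $lazyCount"
$second
```

Only the (?s) option is needed here, since it lets . match newlines. The (?m) option in the original pattern changes the meaning of ^ and $ only, so it has no effect on a pattern that uses neither.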