Extracting text between specific patterns

Time: 2016-04-15 15:27:03

Tags: regex groovy extract

I need to extract a substring that looks like an XML tag inside a plain-text document, for example:

lsdkfjsdklfj sdklfsdklfjsd <AAA>myString</AAA>sdfsdfsdfsdf

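Can I extract this pattern in a single command?

In this case I tried using the matcher and group commands to extract this single match.

I don't want to do something like this:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String pattern = /<AAA>(.*)<\/AAA>/;

// Create a Pattern object
Pattern r = Pattern.compile(pattern);

// Now create matcher object.
Matcher m = r.matcher("lsdkfjsdklfj sdklfsdklfjsd <AAA>myString</AAA>sdfsdfsdfsdf");
if (m.find()) {
    // group(1) is the captured text; group(0) would be the whole match including the tags
    System.out.println("Found value: " + m.group(1));
}

There must be a more elegant way.

Edit: Thanks tim_yates, I was looking for something like that.

Can you explain why [0][1] is used on the result of the =~ match?

Answer from tim_yates:

=~ returns a matcher, so [0] gets the first match, which holds 2 groups: the first is the matched text (the whole match), and the second, [1], is the group you defined in your expression.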
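
To make the indexing concrete, here is a minimal Java sketch of roughly what the Groovy indexing does, using the same input string. The class name IndexingDemo and the helper firstMatchGroups are made up for illustration. Groovy's [0] corresponds to the first successful find(), and the list it yields holds group(0) (the whole match) followed by group(1) (the captured group). One difference to keep in mind: this helper returns an empty list when nothing matches, whereas indexing a Groovy matcher that has no match throws an exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IndexingDemo {
    // Hypothetical helper: returns all groups of the first match,
    // similar in spirit to what (input =~ regex)[0] yields in Groovy.
    public static List<String> firstMatchGroups(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> groups = new ArrayList<>();
        if (m.find()) {
            // Index 0 is the whole match; indices 1..groupCount() are the capture groups.
            for (int i = 0; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        String input = "lsdkfjsdklfj sdklfsdklfjsd <AAA>myString</AAA>sdfsdfsdfsdf";
        List<String> groups = firstMatchGroups(input, "<AAA>(.+?)</AAA>");
        System.out.println(groups.get(0)); // whole match: <AAA>myString</AAA>
        System.out.println(groups.get(1)); // captured group: myString
    }
}
```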

Thank you very much for your help, and thanks to all readers. The power of community!!!

2 answers:

Answer 0: (score: 0)

Can't you just do this:

def input = 'lsdkfjsdklfj sdklfsdklfjsd <AAA>myString</AAA>sdfsdfsdfsdf'
def extract = (input =~ '<AAA>(.+?)</AAA>')[0][1]
assert extract == 'myString'
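
One detail worth noting: this answer uses the reluctant quantifier (.+?), while the pattern in the question used the greedy (.*). For the sample string both give the same result, but they diverge as soon as the input contains more than one tag pair. The Java sketch below illustrates this with a made-up two-tag input; the class QuantifierDemo and the helper extractAll are hypothetical names, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {
    // Hypothetical helper: collects capture group 1 of every match in the input.
    public static List<String> extractAll(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up input with two tag pairs on one line.
        String input = "x <AAA>one</AAA> y <AAA>two</AAA> z";

        // Greedy: runs from the first <AAA> to the LAST </AAA>, giving one oversized match.
        System.out.println(extractAll(input, "<AAA>(.*)</AAA>"));

        // Reluctant: stops at the nearest </AAA>, giving one match per tag pair.
        System.out.println(extractAll(input, "<AAA>(.+?)</AAA>"));
    }
}
```

With the greedy pattern the single captured value is "one</AAA> y <AAA>two"; with the reluctant pattern the values are "one" and "two".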

Answer 1: (score: 0)

This is the shortest (not the best) way I can think of without external libraries:

String str = "lsdkfjsdklfj sdklfsdklfjsd <AAA>myString</AAA>sdfsdfsdfsdf";
System.out.println(str.substring(str.indexOf(">") + 1, str.lastIndexOf("<")));

Or use StringUtils from Apache Commons Lang (a million times better than my earlier suggestion using substring):

StringUtils.substringBetween(str, "<AAA>", "</AAA>");

I would still go with matcher() as you proposed.
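
For readers who do not have Apache Commons Lang on the classpath, a hand-rolled stand-in for substringBetween might look like the sketch below. The class BetweenDemo and the method between are hypothetical names, and this only approximates the library call: it returns null when either delimiter is missing. Unlike the indexOf(">") / lastIndexOf("<") version above, it searches for the full tags, so stray angle brackets elsewhere in the text do not affect the result.

```java
public class BetweenDemo {
    // Hypothetical helper: returns the text between the first occurrence of `open`
    // and the next occurrence of `close` after it, or null if either is missing.
    public static String between(String str, String open, String close) {
        int start = str.indexOf(open);
        if (start < 0) {
            return null;
        }
        start += open.length(); // move past the opening tag
        int end = str.indexOf(close, start);
        if (end < 0) {
            return null;
        }
        return str.substring(start, end);
    }

    public static void main(String[] args) {
        String str = "lsdkfjsdklfj sdklfsdklfjsd <AAA>myString</AAA>sdfsdfsdfsdf";
        System.out.println(between(str, "<AAA>", "</AAA>")); // prints myString
    }
}
```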