I have an ArrayList of Strings that I want to search:
// results.getOptions() is an ArrayList<String>
int counter = 0;
for (String option : results.getOptions()) {
    System.err.println("Item " + counter + " :" + option);
    counter++;
}
The output of this code is:
Item 0 :<ET>read input: 11.844ms</ET>
Item 1 :<ET>import: 2069.9ms</ET>
Item 2 :<ET>calc: 23.022ms</ET>
Item 3 :<ET>decompress .tax: 5.451ms</ET>
Item 4 :<ET>decrypt .tax: 4.409ms</ET>
Item 5 :<ET>load .tax formsets: 7.929ms</ET>
Item 6 :<ET>There were 4 calc errors:
Item 7 :F941 0 ZIP 0 - <Error><FormCd>INWKS941</FormCd><Level>Fatal</Level><Source>Company</Source><Entity>50-7754170</Entity><Category>CompanyInfo</Category><Message>Zip code is invalid. You must enter a valid ZIP code for your state. Enter a correct ZIP code in this format 'nnnnn' or 'nnnnn-nnnn'.</Message></Error>.
Item 8 :F941 0 STATE 0 - <Error><FormCd>INWKS941</FormCd><Level>Fatal</Level><Source>Company</Source><Entity>50-7754170</Entity><Category>CompanyInfo</Category><Message>State abbreviation is invalid. Enter your two-letter postal state abbreviation.</Message></Error>.
Item 9 :F941 0 L11 0 - <Error><FormCd>INWKS941</FormCd><Level>Fatal</Level><Source>Company</Source><Entity>50-7754170</Entity><Category>Calculation</Category><Message>Total taxes after adjustments does not equal the total quarter liability on Schedule B. You must make the necessary adjustments to reconcile the amounts.</Message></Error>.
Item 10 :F941 0 L15 0 - <Error><FormCd>INWKS941</FormCd><Level>Informational</Level><Source>Company</Source><Entity>50-7754170</Entity><Category>FormInfo</Category><Message>There is a balance due on this form of $6567.78.</Message></Error>.
Item 11 :: 0.034ms</ET>
Item 12 :<ET>write FormML: 8.739ms</ET>
Item 13 :<ET>flush FormML: 0.602ms</ET>
Item 14 :<ET>copy FormML to output vector: 1.763ms</ET>
Item 15 :<ET>convert: 2147.71ms</ET>
Item 16 :<ET>write output: 0.782ms</ET>
Item 17 :<FORMSET id="FORMML"/>
Item 18 :<DATA size="247750"/>
Item 19 :<ERROR code="0"/>
Item 20 :
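I want to capture the lines of text (and their indexes) that start with:
<ET>There were 4 calc errors:
and end with:
</ET>
(items 6-11 in the output).
What regular expression would capture these particular lines? I have a piece of Java code that will return the matching entries, but what would the regular expression be?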
List<String> getMatchingStrings(List<String> list, String regex) {
    ArrayList<String> matches = new ArrayList<String>();
    Pattern p = Pattern.compile(regex);
    for (String s : list) {
        // matches() succeeds only if the whole string matches the regex
        if (p.matcher(s).matches()) {
            matches.add(s);
        }
    }
    return matches;
}
Answer 0 (score: 1)
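If you are trying to match each string in the list individually, the regex
\\<ET\\>There were 4 calc errors:.*\\</ET\\>
should work when a single string contains both the opening text and the closing tag: the special characters are escaped, and .* matches every character between the tags. In the output shown in the question, however, the opening text (item 6) and the closing </ET> (item 11) are in different list elements, so no single element matches this pattern on its own.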
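Because the question also asks for the indexes of the lines, one alternative is to skip the single big regex and scan the list element by element. The following is only a sketch under stated assumptions: the class name `CalcErrorRange` and the method `findRange` are hypothetical, the sample data is abbreviated from the question's output, and the block is assumed to end at the first element containing `</ET>` at or after the start element.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class CalcErrorRange {
    // Assumption: the block starts at an element beginning with this text
    private static final Pattern START =
        Pattern.compile("^<ET>There were \\d+ calc errors:");
    private static final String END = "</ET>";

    /** Returns {startIndex, endIndex} (inclusive), or null if no complete block exists. */
    static int[] findRange(List<String> lines) {
        for (int i = 0; i < lines.size(); i++) {
            // find() looks for the pattern anywhere; '^' pins it to the start
            if (!START.matcher(lines.get(i)).find()) {
                continue;
            }
            // The closing tag may be on the start element itself or on a later one
            for (int j = i; j < lines.size(); j++) {
                if (lines.get(j).contains(END)) {
                    return new int[] {i, j};
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Abbreviated sample data modeled on the question's output
        List<String> options = Arrays.asList(
            "<ET>calc: 23.022ms</ET>",
            "<ET>There were 2 calc errors:",
            "F941 0 ZIP 0 - <Error><Message>Zip code is invalid.</Message></Error>.",
            "F941 0 STATE 0 - <Error><Message>State abbreviation is invalid.</Message></Error>.",
            ": 0.034ms</ET>",
            "<ET>write FormML: 8.739ms</ET>");
        int[] range = findRange(options);
        System.out.println(range[0] + ".." + range[1]); // prints 1..4
    }
}
```

With the returned indexes, `options.subList(range[0], range[1] + 1)` gives the captured lines themselves.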
Answer 1 (score: 1)
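First, a regex operates on a single string, so you have to combine the separate strings into one. You probably want to treat them as lines, so this will work: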
StringBuilder buf = new StringBuilder();
for (String option : results.getOptions()) {
    buf.append(option).append("\r\n");
}
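Next, you need a regex that works across multiple lines, so you need the DOTALL option ("In dotall mode, the expression . matches any character, including a line terminator").
In addition, you need the regex to be "reluctant" rather than "greedy", so you need .*? instead of .*, and because you want to capture the text between the start pattern and the end pattern, you need a capturing group ().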
String regex = "<ET>There were 4 calc errors:(.*?)</ET>";
Pattern p = Pattern.compile(regex, Pattern.DOTALL);
Matcher m = p.matcher(buf.toString());
while (m.find()) {
    String errorText = m.group(1);
    // Use errorText here
}
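To see why the reluctant quantifier matters here, the sketch below runs the greedy and reluctant variants against a shortened input that, like the real output, has further `</ET>` tags after the error block. The class name `GreedyVsReluctant`, the helper `capture`, and the sample input are illustrative assumptions, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    /** Returns capture group 1 of the first match in DOTALL mode, or null if there is none. */
    static String capture(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the joined lines; a second <ET> element follows the block
        String input = "<ET>There were 4 calc errors:\r\n(error lines omitted)\r\n: 0.034ms</ET>\r\n"
                     + "<ET>write FormML: 8.739ms</ET>\r\n";

        String greedy = capture("<ET>There were 4 calc errors:(.*)</ET>", input);
        String reluctant = capture("<ET>There were 4 calc errors:(.*?)</ET>", input);

        // Greedy runs on to the LAST </ET>, swallowing the next element
        System.out.println(greedy.contains("write FormML"));    // prints true
        // Reluctant stops at the FIRST </ET> after the start text
        System.out.println(reluctant.contains("write FormML")); // prints false
    }
}
```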
Of course, if there are not always exactly 4 errors, you can use this:
String regex = "<ET>There were \\d+ calc errors:(.*?)</ET>";
The captured text will start with a line break, so you can either trim() the error text or add the line break to the pattern:
String regex = "<ET>There were \\d+ calc errors:\r\n(.*?)</ET>";
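Putting the pieces of this answer together, a self-contained sketch might look like the following. The class name `CalcErrorExtractor`, the method `extractErrorLines`, and the abbreviated sample data are assumptions for illustration; `String.join` is used in place of the `StringBuilder` loop, so the joined text has no trailing line break, which does not affect the match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CalcErrorExtractor {
    // DOTALL lets '.' cross line breaks; '.*?' stops at the first </ET>
    private static final Pattern BLOCK = Pattern.compile(
        "<ET>There were \\d+ calc errors:\r\n(.*?)</ET>", Pattern.DOTALL);

    /** Joins the list into one string and returns the captured lines of each error block. */
    static List<String> extractErrorLines(List<String> options) {
        String joined = String.join("\r\n", options);
        List<String> errors = new ArrayList<>();
        Matcher m = BLOCK.matcher(joined);
        while (m.find()) {
            // Split the captured text back into the original list elements
            errors.addAll(Arrays.asList(m.group(1).split("\r\n")));
        }
        return errors;
    }

    public static void main(String[] args) {
        // Abbreviated sample data modeled on the question's output
        List<String> options = Arrays.asList(
            "<ET>calc: 23.022ms</ET>",
            "<ET>There were 2 calc errors:",
            "F941 0 ZIP 0 - first error",
            "F941 0 STATE 0 - second error",
            ": 0.034ms</ET>",
            "<ET>write FormML: 8.739ms</ET>");
        for (String line : extractErrorLines(options)) {
            System.out.println(line);
        }
    }
}
```

Note that the last captured line is the fragment that precedes the closing tag (here `: 0.034ms`); drop it if only the error lines themselves are wanted.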