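I am trying to extract the words 'and', 'a', 'an', 'the' and the '&' character from a block of text, along with all the numbers it contains.
I have tried several regular expressions for this but could not get accurate results.
The numbers are all extracted fine, but I cannot get the strings listed above with a regex.
My basic regex is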
Pattern p = Pattern.compile("^[0-9]");
Then I tried different combinations, such as
Pattern p = Pattern.compile("^[0-9](&)");
Pattern p = Pattern.compile("^[0-9]+[&]");
to get the strings above, but with no luck.
Sample text:
System requirements: iOS 6.0 and Android (varies) &
Version used in this guide: 2.2.4 (iPhone), 13.1.2 (Android)
Expected result:
6.0,and,&,2.2.4,13.1.2
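Answer 0: (score: 1)
Your attempts are not even close, and I almost feel bad just handing you the solution, but if you really are "keen to learn new things" (as you say in your SO profile), have a look at a regex tutorial.
Basic use of alternation, grouping, quantifiers and anchors (/word boundaries) will solve your problem.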
(\b(?:a|an|and|the)\b|&|\d+(?:\.\d+)*)
Explanation:
NODE                     EXPLANATION
--------------------------------------------------------------------------------
(                        group and capture to \1:
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w)
                           and something that is not a word char
--------------------------------------------------------------------------------
  (?:                      group, but do not capture:
--------------------------------------------------------------------------------
    a                        'a'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    an                       'an'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    and                      'and'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    the                      'the'
--------------------------------------------------------------------------------
  )                        end of grouping
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w)
                           and something that is not a word char
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  &                        '&'
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  \d+                      digits (0-9) (1 or more times (matching
                           the most amount possible))
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more
                           times (matching the most amount
                           possible)):
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
  )*                       end of grouping
--------------------------------------------------------------------------------
)                        end of \1
To use it in Java, you have to escape every \ with a second backslash:
(\\b(?:a|an|and|the)\\b|&|\\d+(?:\\.\\d+)*)
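For completeness, here is a minimal sketch of how the escaped pattern might be applied to the sample text from the question. The class name TokenExtractor and the helper extract are made up for illustration; only Pattern, Matcher and the regex itself come from the discussion above. Unlike the ^-anchored attempts in the question, this pattern has no anchor, so Matcher.find() can walk through the whole input and return each match in turn.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for the example.
public class TokenExtractor {
    // The pattern from the answer, with every backslash doubled for a Java string literal.
    private static final Pattern TOKENS =
            Pattern.compile("(\\b(?:a|an|and|the)\\b|&|\\d+(?:\\.\\d+)*)");

    // Collects every match in order of appearance.
    public static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = TOKENS.matcher(text);
        // find() resumes after the previous match, so the loop visits all matches.
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "System requirements: iOS 6.0 and Android (varies) &\n"
                + "Version used in this guide: 2.2.4 (iPhone), 13.1.2 (Android)";
        // Prints: 6.0,and,&,2.2.4,13.1.2
        System.out.println(String.join(",", extract(text)));
    }
}
```

Note that the match is case-sensitive as written, so "And" or "The" at the start of a sentence would be skipped; passing Pattern.CASE_INSENSITIVE as a second argument to Pattern.compile is one way to change that if needed.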
Answer 1: (score: 0)
You can use the following regex:
See the DEMO.