I am trying to remove alphanumeric words (any word that contains digits) from a string.
String[] sentenceArray = {"India123156 hel12lo 10000 cricket 21355 sport news 000Fifa"};
for (String s : sentenceArray)
{
    String finalResult = new String();
    String finalResult1 = new String();
    String str = s.toString();
    System.out.println("before regex : " + str);
    String regex = "(\\d?[,/%]?\\d|^[a-zA-Z0-9_]*)";
    finalResult1 = str.replaceAll(regex, " ");
    finalResult = finalResult1.trim().replaceAll(" +", " ");
    System.out.println("after regex : " + finalResult);
}
Output: hel lo cricket sport news Fifa
But the output I need is: cricket sport news
Guys, please help. Thanks in advance.
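Answer 0 (score: 2)
To match the words you want to exclude together with the whitespace that follows them, you can use the following regex in case-insensitive mode: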
\b(?=[a-z]*\d+)\w+\s*\b
In Java, you can do the replacement like this:
String replaced = your_original_string.replaceAll("(?i)\\b(?=[a-z]*\\d+[a-z]*)\\w+\\s*\\b", "");
Token-by-Token Explanation
\b # the boundary between a word char (\w) and
# something that is not a word char
(?= # look ahead to see if there is:
[a-z]* # any character of: 'a' to 'z' (0 or more
# times (matching the most amount
# possible))
\d+ # digits (0-9) (1 or more times (matching
# the most amount possible))
) # end of look-ahead
\w+ # word characters (a-z, A-Z, 0-9, _) (1 or
# more times (matching the most amount
# possible))
\s* # whitespace (\n, \r, \t, \f, and " ") (0 or
# more times (matching the most amount
# possible))
\b # the boundary between a word char (\w) and
# something that is not a word char
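As a rough sketch of how the pattern above could be applied to the sample sentence from the question: the class name DigitWordRemover and the helper removeWordsWithDigits are made up for illustration, and the trailing trim() is an added assumption, since the replacement alone leaves a space behind when the last word of the input is removed.

```java
import java.util.regex.Pattern;

class DigitWordRemover {
    // Pattern from the answer; the CASE_INSENSITIVE flag stands in for the inline (?i).
    private static final Pattern WORD_WITH_DIGIT =
            Pattern.compile("\\b(?=[a-z]*\\d+)\\w+\\s*\\b", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper name, not part of the original answer.
    static String removeWordsWithDigits(String input) {
        // trim() is an addition: it drops whitespace left at either end of the result.
        return WORD_WITH_DIGIT.matcher(input).replaceAll("").trim();
    }

    public static void main(String[] args) {
        String s = "India123156 hel12lo 10000 cricket 21355 sport news 000Fifa";
        System.out.println(removeWordsWithDigits(s)); // cricket sport news
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex on every call, which is what String.replaceAll does internally.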
Answer 1 (score: 2)
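public static void main(String[] args) {
    String s = "India123156 hel12lo 10000 cricket 21355 sport news 000Fifa";
    // String s = "cricket abc";
    // expected result: cricket sport news
    // Remove every word that contains digits, then collapse the runs of spaces left behind.
    System.out.println(s.replaceAll("\\b\\w+?[0-9]+\\w+?\\b", "").replaceAll(" +", " ").trim());
}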
Output:
cricket sport news
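Explanation:
\\b --> word boundary, i.e. it marks the beginning or end of a word.
\\w+? --> one or more word characters (letters, digits, underscore), matched lazily, i.e. as few as possible.
[0-9]+ --> one or more digits.
\\w+? --> one or more further word characters, again matched lazily, up to the closing word boundary.
trim() --> removes leading and trailing whitespace.
Note: because each \\w+? needs at least one character, the pattern only matches a word that has a digit somewhere other than its first or last position, so a token such as a1, 7 or 1abc is left in place.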
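If short tokens such as a1 or 7 also need to go, one possible alternative (a swapped-in technique, not what this answer proposes) is to split on whitespace and keep only the tokens without a digit. The sketch below assumes that a "word" means a whitespace-separated token; TokenFilter and keepWordsWithoutDigits are hypothetical names.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class TokenFilter {
    // Hypothetical helper: keeps only whitespace-separated tokens that contain no digit.
    // Character.isDigit also accepts non-ASCII digits, unlike [0-9].
    static String keepWordsWithoutDigits(String input) {
        return Arrays.stream(input.trim().split("\\s+"))
                .filter(token -> token.chars().noneMatch(Character::isDigit))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String s = "India123156 hel12lo 10000 cricket 21355 sport news 000Fifa";
        System.out.println(keepWordsWithoutDigits(s)); // cricket sport news

        // Short tokens: the regex from the answer leaves them untouched.
        String edge = "a1 7 1abc cricket";
        System.out.println(edge.replaceAll("\\b\\w+?[0-9]+\\w+?\\b", "").trim()); // a1 7 1abc cricket
        System.out.println(keepWordsWithoutDigits(edge)); // cricket
    }
}
```

Joining with a single space also normalizes the spacing, so no separate whitespace cleanup is needed. Punctuation attached to a word stays part of its token, which may or may not be what you want.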