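How can I use a regular expression to filter text against a list of blacklisted (obscene) words? For example, if the blacklist contains 'Bill Joseph':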
Then 'I am Bill Josephine' is valid
but 'I am Bill Joseph.' is invalid
'I am Bill Joseph,' is invalid
'I am Bill Joseph ' is invalid
'I am Bill Joseph<any non alphanumeric>' is invalid.
Similarly 'I am .Bill Joseph' is invalid
'I am <any non alphanumeric>Bill Joseph' is invalid.
Answer 0 (score: 1)
Simple, this works:
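import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The name must be bounded by a non-word character or by the start/end of the input.
String badStrRegex = "(?:^|\\W)Bill Joseph(?:\\W|$)";
Pattern pattern = Pattern.compile(badStrRegex);
Matcher m = pattern.matcher(testStr); // testStr is your string under test
boolean isBad = m.find();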
It works! Tested against all the inputs.
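A claim like "tested all inputs" is easy to check by running the examples from the question through a small harness. The following is only a sketch: the class and method names are made up for illustration, and the two candidate patterns in main are examples of what one might compare, not a definitive answer.

```java
import java.util.regex.Pattern;

class BlacklistHarness {
    // Inputs taken from the question; true means the text should be rejected.
    static final String[] INPUTS = {
        "I am Bill Josephine", "I am Bill Joseph.", "I am Bill Joseph,",
        "I am Bill Joseph ", "I am .Bill Joseph"
    };
    static final boolean[] EXPECTED = { false, true, true, true, true };

    // Hypothetical helper: true if the regex is found anywhere in the text.
    static boolean isBad(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    // Counts how many of the question's inputs a candidate regex misclassifies.
    static int failures(String regex) {
        int failed = 0;
        for (int i = 0; i < INPUTS.length; i++) {
            if (isBad(regex, INPUTS[i]) != EXPECTED[i]) {
                System.out.println("FAIL: '" + INPUTS[i] + "' expected bad=" + EXPECTED[i]);
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Optional trailing \W: also fires on "Bill Josephine", so one input fails.
        System.out.println(failures("\\WBill Joseph\\W?"));
        // Boundary required on both sides, with start/end of input allowed: no failures.
        System.out.println(failures("(?:^|\\W)Bill Joseph(?:\\W|$)"));
    }
}
```

The point of the comparison is that an optional `\\W?` after the name can match nothing at all, so it does not stop the pattern from firing inside a longer word.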
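Answer 1 (score: 1)
Use a negated alphanumeric character class:
"[^A-Za-z0-9]Bill Joseph[^A-Za-z0-9]"
Using "\W" in place of "[^A-Za-z0-9]" works in most cases, except when there is an underscore directly before or after the name, because "\W" does not match an underscore. So "Bill Joseph_" would still be treated as valid.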
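One possible variation on the negated class, sketched here as an assumption rather than as part of the answer, is to use negative lookarounds. They check that no letter or digit touches the name without consuming a character, so the name is also caught at the very start or end of the input, and an underscore counts as a separator. The class name AlnumBoundary and the helper isBad are hypothetical.

```java
import java.util.regex.Pattern;

class AlnumBoundary {
    // No ASCII letter or digit may sit directly before or after the name.
    static final Pattern BAD =
        Pattern.compile("(?<![A-Za-z0-9])Bill Joseph(?![A-Za-z0-9])");

    // Hypothetical helper: true if the blacklisted name occurs as a standalone term.
    static boolean isBad(String text) {
        return BAD.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(isBad("I am Bill Josephine")); // false
        System.out.println(isBad("Bill Joseph_"));        // true
        System.out.println(isBad("Bill Joseph"));         // true
        System.out.println(isBad("xBill Joseph"));        // false
    }
}
```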
Answer 2 (score: 0)
Enforce word boundaries around the blacklisted word: ".*\\b" + badWord + "\\b.*"
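For a whole list of blacklisted terms, the word-boundary idea could be sketched as follows. This is an illustration under assumptions: the class and method names are hypothetical, and the second list entry "Foo Bar" is a placeholder that does not come from the question. Pattern.quote keeps any regex metacharacters in a term literal, and using find() removes the need for the surrounding ".*".

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class BlacklistFilter {
    // Builds one pattern of the form \b(?:term1|term2|...)\b with each term quoted.
    static Pattern compile(List<String> badWords) {
        String alternation = badWords.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile("\\b(?:" + alternation + ")\\b");
    }

    // Hypothetical helper: true if any blacklisted term occurs on word boundaries.
    static boolean containsBadWord(Pattern blacklist, String text) {
        return blacklist.matcher(text).find();
    }

    public static void main(String[] args) {
        // "Foo Bar" is a placeholder entry used only for this example.
        Pattern blacklist = compile(Arrays.asList("Bill Joseph", "Foo Bar"));
        System.out.println(containsBadWord(blacklist, "I am Bill Josephine")); // false
        System.out.println(containsBadWord(blacklist, "I am .Bill Joseph"));   // true
        System.out.println(containsBadWord(blacklist, "I am Bill Joseph_"));   // false
    }
}
```

Note that "\b" treats an underscore as a word character, so "Bill Joseph_" is not flagged here; whether that is acceptable depends on how the blacklist is meant to behave.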