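Profanity filter

Time: 2015-05-28 16:36:08

Tags: java

So far I have been able to censor "cat", "dog" and "llama". Now I just need to make an exception for "dogmatic", but I cannot figure it out for the life of me. I have attached what I have so far below. Any suggestions would really help.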

/* take userinput and determine if it contains profanity
 * if userinput contains profanity, it will be filtered 
 * and a new sentence will be generated with the word censored
 */
Scanner keyboard = new Scanner(System.in); // requires: import java.util.Scanner;
System.out.println("Welcome to the Star Bulletin Board!");
System.out.println("Generate your first post below!");

String userInput = keyboard.nextLine();

userInput = userInput.toLowerCase();

if (userInput.indexOf("cat") != -1){
    System.out.println("Your post contains profanity.");
    System.out.println("I have altered your post to appear as: ");
    System.out.println(userInput.replaceAll("cat", "***"));
}
else 
    System.out.println(userInput);

if (userInput.indexOf("dog") != -1){
    System.out.println("Your post contains profanity.");
    System.out.println("I have altered your post to appear as: ");
    System.out.println(userInput.replaceAll("dog", "***"));
}
if (userInput.indexOf("llama")!= -1){
    System.out.println("Your post contains profanity.");
    System.out.println("I have altered your post to appear as: ");
    System.out.println(userInput.replaceAll("llama", "*****"));
}

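2 answers:

Answer 0 (score: 4)

You can use the word boundary \\b. A word boundary matches at the edge of a word, for example next to a space or a punctuation mark.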

if (userInput.matches(".*\\bdog\\b.*")) {
    userInput = userInput.replaceAll("\\bdog\\b", "***");
}

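This will censor "Don't be a dog." but it will not censor "Don't be dogmatic."

The condition userInput.matches(".*\\bdog\\b.*") is slightly better than indexOf / contains because it tests for the same match that the replacement uses. With indexOf / contains the message would still be displayed even when nothing ends up being censored. The .* matches zero or more of any character (by default, any character except a line terminator).

Note: this is still not an effective way to filter profanity. See http://blog.codinghorror.com/obscenity-filters-bad-idea-or-incredibly-intercoursing-bad-idea/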
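To make the difference concrete, here is a minimal sketch that runs both checks on the same post. The class name BoundaryCheck and its helper methods are made up for illustration and are not part of the original answer.

```java
class BoundaryCheck {
    // Substring test: true whenever "dog" occurs anywhere, even inside a longer word.
    static boolean containsSubstring(String text) {
        return text.indexOf("dog") != -1;
    }

    // Whole-word test: uses the same \bdog\b pattern as the replacement below.
    static boolean containsWord(String text) {
        return text.matches(".*\\bdog\\b.*");
    }

    // Replaces only whole-word occurrences of "dog".
    static String censor(String text) {
        return text.replaceAll("\\bdog\\b", "***");
    }

    public static void main(String[] args) {
        String post = "don't be dogmatic.";
        System.out.println(containsSubstring(post)); // true: the message would be shown
        System.out.println(containsWord(post));      // false: no message
        System.out.println(censor(post));            // don't be dogmatic.
    }
}
```

With the substring check, the "contains profanity" message would appear for this post although the replacement changes nothing; the word-boundary check stays in step with the replacement.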

Answer 1 (score: 2)

Use word boundaries. Take a look at the code below; it prints true for every case except the last one.

String a = "what you there";
String b = "yes what there";
String c = "yes there what";
String d = "whatabout this";

System.out.println(Pattern.compile("\\bwhat\\b").matcher(a).find());
System.out.println(Pattern.compile("\\bwhat\\b").matcher(b).find());
System.out.println(Pattern.compile("\\bwhat\\b").matcher(c).find());
System.out.println(Pattern.compile("\\bwhat\\b").matcher(d).find());

You can combine all of the bad words into a single regular expression, like this:

Pattern filter = Pattern.compile("\\b(cat|llama|dog)\\b");
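One possible way to put such a combined pattern to work is sketched below. The class name PostFilter, the censor method, the CASE_INSENSITIVE flag and the same-length asterisk replacement are assumptions added for illustration; the answer itself only gives the pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PostFilter {
    // Word list from the answer. CASE_INSENSITIVE is an added assumption so the
    // post does not need to be lower-cased before filtering.
    private static final Pattern FILTER =
            Pattern.compile("\\b(cat|llama|dog)\\b", Pattern.CASE_INSENSITIVE);

    // Replaces every whole-word match with asterisks of the same length.
    static String censor(String post) {
        Matcher m = FILTER.matcher(post);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String stars = m.group().replaceAll(".", "*");
            m.appendReplacement(out, Matcher.quoteReplacement(stars));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: My *** chased a *****, which was dogmatic.
        System.out.println(censor("My Cat chased a llama, which was dogmatic."));
    }
}
```

Because the check and the replacement share one compiled pattern, a single pass handles all listed words, and words such as "dogmatic" or "catalog" are left alone.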

This works for simple cases, but for a more robust solution you will probably want to use a library. For details, see this question.