Searching a string for stop words

Time: 2017-03-27 04:10:04

Tags: java string eclipse

I am trying to search my string "zach" for the stop words "THE", "BE", "TO", "OF", "AND", "A", "IN", "THAT", "I", "IT", "ON", "IN", "BUT", "IS", "WITH".

I am not sure whether my string search method works, or whether there is a better way to do this.

package zk;

public class Class 
{

    // Stop words to look for, upper case to match the normalized input.
    private static final String[] skoal = {"THE", "BE", "TO", "OF", "AND", "A", "IN",
            "THAT", "I", "IT", "ON", "IN", "BUT", "IS", "WITH"};

    // Returns true if the given word is not one of the stop words.
    public static boolean isNonStopWord(String word)
    {
        for (String stop : skoal) {
            if (stop.equals(word)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String [] args) {

        String zach = ("Amazon offered up more answers Thursday about what"
                + " caused a bunch of websites to fail two days ago. According "
                + "to a postmortem by the company's cloud services business, "
                + "around 9:37 a.m. PT Tuesday an Amazon worker incorrectly"
                + " punched in a command while trying to debug an issue. "
                + "That command shut down a large set of servers at Amazon Web "
                + "Services' Northern Virginia site, causing a domino effect of"
                + " problems. Other services that relied on those S3 cloud"
                + " storage servers were disrupted. Also, removing so much "
                + "server capacity required a full system restart, which then "
                + "took longer than expected, AWS said. The sites affected "
                + "included Quora, Imgur, IFTTT, Giphy and Slack. Amazon was "
                + "able to fix the issue by about 2 p.m. PT.");
        zach = zach.replace(",","");
        zach = zach.replace(".","");
        zach = zach.toUpperCase();
        String [] strings = zach.split(" ");
        for (String s1: strings) 
        {
                System.out.println(s1);

        }
    }
}

2 answers:

Answer 0 (score: 1)

Assuming you are using Java 8+, you can use Stream.noneMatch, like this:

String[] strings = zach.split("\\s+");
for (String s1 : strings) {
    System.out.println(s1 + ": " 
            + Stream.of(skoal).noneMatch(s -> s.equals(s1)));
}

Also, \\s+ matches one or more whitespace characters in a regular expression.
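
For reference, here is a minimal self-contained sketch of the same idea. The class name StopWordDemo, the helper isNonStopWord, and the short sample sentence are assumptions made for illustration; only Stream.noneMatch and the \\s+ split come from the snippet above.

```java
import java.util.stream.Stream;

public class StopWordDemo {
    // Same stop-word list as in the question.
    static final String[] SKOAL = {"THE", "BE", "TO", "OF", "AND", "A", "IN",
            "THAT", "I", "IT", "ON", "IN", "BUT", "IS", "WITH"};

    // True when the word equals none of the stop words (hypothetical helper).
    static boolean isNonStopWord(String word) {
        return Stream.of(SKOAL).noneMatch(s -> s.equals(word));
    }

    public static void main(String[] args) {
        // Sample text (an assumption); upper-cased so it can match the list.
        String text = "That command shut down a large   set of servers";
        // \\s+ treats any run of whitespace as a single separator.
        for (String w : text.toUpperCase().split("\\s+")) {
            System.out.println(w + ": " + isNonStopWord(w));
        }
    }
}
```

Splitting on \\s+ rather than a single space means repeated spaces do not produce empty strings in the result.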

Answer 1 (score: 0)

What would be wrong with using String#matches() here:

// static so that it can be called directly from main
public static boolean hasWord(String input, String word) {
    // \b is a word boundary, so "IN" does not match inside "WITHIN"
    return input.matches(".*\\b" + word + "\\b.*");
}

// now call the above method from somewhere
public static void main (String[] args) {
    String [] skoal = {"THE", "BE", "TO", "OF", "AND", "A", "IN",
                       "THAT", "I", "IT", "ON", "IN", "BUT", "IS", "WITH"};
    String zach = "...";           // your original content
    zach = zach.replace(",", "");  // remove punctuation
    zach = zach.replace(".", "");
    zach = zach.toUpperCase();     // uppercase

    for (String stop : skoal) {
        // check each stop word against the whole text
        if (hasWord(zach, stop)) {
            System.out.println(stop + " true");
        }
        else {
            System.out.println(stop + " false");
        }
    }
}
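
A possible variant of the same word-boundary check, as a sketch only. It swaps String#matches() for Pattern and Matcher.find(), so the leading and trailing .* are not needed, and it wraps the word in Pattern.quote so that regex metacharacters in it are treated literally. The class name HasWordDemo and the sample text are assumptions, not part of the answer above.

```java
import java.util.regex.Pattern;

public class HasWordDemo {
    // True when `word` occurs in `input` as a whole word.
    static boolean hasWord(String input, String word) {
        // Pattern.quote escapes any regex metacharacters in the word.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        // find() looks for the pattern anywhere in the input.
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        // Sample text (an assumption), already upper-cased and stripped of punctuation.
        String text = "AMAZON WAS ABLE TO FIX THE ISSUE";
        String[] stops = {"THE", "TO", "IT", "IN"};
        for (String stop : stops) {
            System.out.println(stop + " " + hasWord(text, stop));
        }
    }
}
```

Because of the \\b anchors, a stop word such as "IN" is not reported when it only appears inside a longer word such as "WITHIN".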