Remove all non-word characters (punctuation) from a string

Time: 2016-12-24 00:32:04

Tags: java string special-characters removing-whitespace

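OK, this is my first post, so please forgive any mistakes. In short, I'm given an array of strings, and my goal is to keep a count of the unique words in it while removing any punctuation from the array.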

public static HashMap<String, Integer> uniqueWords(String[] book) {
    HashMap<String, Integer> hm = new HashMap<>();

    for (int i = 0; i < book.length; i++) {
        if (hm.containsKey(book[i])) {
            hm.put(book[i], hm.get(book[i]) + 1);
        } else {
            book[i] = book[i].replaceAll("[^a-zA-Z]","").replaceAll("\\p{Punct}","").replaceAll("\\W+","").replaceAll("\\n","").toLowerCase();
            hm.put(book[i], 1);
        }
    }
    return hm;
}

Input: {"Redfish", "redfish", "redfish", "Bluefish", "bluefish", "bluefish", "*", "%", ""};

Output: {=2, bluefish=3, redfish=3}

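So I managed to remove the whitespace, but the asterisk and the percent sign are still being counted (they show up under the empty key).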

Any help is appreciated, thanks.

1 Answer:

Answer 0 (score: 0)

Try something like this:

    // Requires: import java.util.HashMap;
    public static HashMap<String, Integer> uniqueWords(String[] book) {
        HashMap<String, Integer> hm = new HashMap<>();

        for (int i = 0; i < book.length; i++) {
            // Clean the string first so lookups and inserts use the same key.
            // "[^a-zA-Z]" already removes punctuation, whitespace and newlines.
            String word = book[i].replaceAll("[^a-zA-Z]", "").toLowerCase();

            // Entries such as "*", "%" or "" are empty after cleaning; skip them.
            if (word.isEmpty()) {
                continue;
            }
            hm.put(word, hm.getOrDefault(word, 0) + 1);
        }
        return hm;
    }
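
If you want to check this end to end, here is a minimal self-contained sketch. Two things matter: clean each string before looking it up in the map, and skip strings that become empty after cleaning (that is where the "*" and "%" entries were going). The class name WordCounter, the use of a TreeMap for sorted output, and the assumption that only ASCII letters count as word characters are my choices, not something from the question.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class name, used only for this illustration.
class WordCounter {

    // Counts each cleaned, lower-cased word; leaves the input array unchanged.
    static Map<String, Integer> uniqueWords(String[] book) {
        // TreeMap keeps keys sorted so the printed result is predictable.
        Map<String, Integer> counts = new TreeMap<>();
        for (String raw : book) {
            // Assumption: anything that is not an ASCII letter is noise.
            String word = raw.replaceAll("[^a-zA-Z]", "").toLowerCase();
            // Strings made only of punctuation or whitespace are now empty.
            if (word.isEmpty()) {
                continue;
            }
            // Insert 1 for a new word, otherwise add 1 to the existing count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] book = {"Redfish", "redfish", "redfish",
                         "Bluefish", "bluefish", "bluefish", "*", "%", ""};
        System.out.println(uniqueWords(book)); // prints {bluefish=3, redfish=3}
    }
}
```

If the text can contain letters outside a-z, a pattern like "\\P{L}" (everything that is not a Unicode letter) may be a better fit than "[^a-zA-Z]".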