计算字符串中Word的出现次数

时间:2014-03-21 18:15:18

标签: java regex string

我是Java Strings的新手,问题是我想要计算String中特定单词的出现次数。假设我的字符串是:

i have a male cat. the color of male cat is Black

现在我不想分开它,所以我想搜索一个单词" male cat"。它在我的字符串中出现两次!

我正在尝试的是:

int c = 0;
for (int j = 0; j < text.length(); j++) {
    if (text.contains("male cat")) {
        c += 1;
    }
}

System.out.println("counter=" + c);
它给了我46个反值!那么解决方案是什么?

23 个答案:

答案 0 :(得分:35)

您可以使用以下代码:

String in = "i have a male cat. the color of male cat is Black";
int i = 0;
Pattern p = Pattern.compile("male cat");
Matcher m = p.matcher( in );
while (m.find()) {
    i++;
}
System.out.println(i); // Prints 2

Demo

它做什么?

匹配"male cat"

while(m.find())

表示,在m找到匹配项时,在循环内执行任何操作。 我将i的值增加i++,所以很明显,这会给出一个字符串male cat的数量。

答案 1 :(得分:10)

如果您只想要"male cat"的计数,那么我会这样做:

String str = "i have a male cat. the color of male cat is Black";
int c = str.split("male cat").length - 1;
System.out.println(c);

如果您想确保"female cat"不匹配,请在拆分正则表达式中使用\\b字边界:

int c = str.split("\\bmale cat\\b").length - 1;

答案 2 :(得分:9)

apache commons-lang中的

StringUtils CountMatches 方法来计算一个String在另一个String中出现的次数。

   String input = "i have a male cat. the color of male cat is Black";
   int occurance = StringUtils.countMatches(input, "male cat");
   System.out.println(occurance);

答案 3 :(得分:4)

Java 8版本:

    public static long countNumberOfOccurrencesOfWordInString(String msg, String target) {
    return Arrays.stream(msg.split("[ ,\\.]")).filter(s -> s.equals(target)).count();
}

答案 4 :(得分:3)

Java 8版本。

System.out.println(Pattern.compile("\\bmale cat")
            .splitAsStream("i have a male cat. the color of male cat is Black")
            .count()-1);

答案 5 :(得分:2)

使用indexOf ...

public static int count(String string, String substr) {
    int i;
    int last = 0;
    int count = 0;
    do {
        i = string.indexOf(substr, last);
        if (i != -1) count++;
        last = i+substr.length();
    } while(i != -1);
    return count;
}

public static void main (String[] args ){
    System.out.println(count("i have a male cat. the color of male cat is Black", "male cat"));
}

这将显示:2

count()的另一个实现,只需一行:

public static int count(String string, String substr) {
    return (string.length() - string.replaceAll(substr, "").length()) / substr.length() ;
}

答案 6 :(得分:2)

static方法确实返回另一个字符串上字符串的出现次数。

/**
 * Returns the number of appearances that a string have on another string.
 * 
 * @param source    a string to use as source of the match
 * @param sentence  a string that is a substring of source
 * @return the number of occurrences of sentence on source 
 */
public static int numberOfOccurrences(String source, String sentence) {
    int occurrences = 0;

    if (source.contains(sentence)) {
        int withSentenceLength    = source.length();
        int withoutSentenceLength = source.replace(sentence, "").length();
        occurrences = (withSentenceLength - withoutSentenceLength) / sentence.length();
    }

    return occurrences;
}

<强>试验:

String source = "Hello World!";
numberOfOccurrences(source, "Hello World!");   // 1
numberOfOccurrences(source, "ello W");         // 1
numberOfOccurrences(source, "l");              // 3
numberOfOccurrences(source, "fun");            // 0
numberOfOccurrences(source, "Hello");          // 1

顺便说一下,这个方法可以写成一行,很糟糕,但它也有效:)

public static int numberOfOccurrences(String source, String sentence) {
    return (source.contains(sentence)) ? (source.length() - source.replace(sentence, "").length()) / sentence.length() : 0;
}

答案 7 :(得分:2)

为什么不递归?

public class CatchTheMaleCat  {
    private static final String MALE_CAT = "male cat";
    static int count = 0;
    public static void main(String[] arg){
        wordCount("i have a male cat. the color of male cat is Black");
        System.out.println(count);
    }

    private static boolean wordCount(String str){
        if(str.contains(MALE_CAT)){
            count++;
            return wordCount(str.substring(str.indexOf(MALE_CAT)+MALE_CAT.length()));
        }
        else{
            return false;
        }
    }
}

答案 8 :(得分:1)

用空字符串替换需要计数的字符串,然后使用不带字符串的长度来计算出现次数。

public int occurrencesOf(String word)
    {
    int length = text.length();
    int lenghtofWord = word.length();
    int lengthWithoutWord = text.replace(word, "").length();
    return (length - lengthWithoutWord) / lenghtofWord ;
    }

答案 9 :(得分:1)

公共类TestWordCount {

public static void main(String[] args) {

    int count = numberOfOccurences("Alice", "Alice in wonderland. Alice & chinki are classmates. Chinki is better than Alice.occ");
    System.out.println("count : "+count);

}

public static int numberOfOccurences(String findWord, String sentence) {

    int length = sentence.length();
    int lengthWithoutFindWord = sentence.replace(findWord, "").length();
    return (length - lengthWithoutFindWord)/findWord.length();

}

}

答案 10 :(得分:1)

这将有效

int word_count(String text,String key){
   int count=0;
   while(text.contains(key)){
      count++;
      text=text.substring(text.indexOf(key)+key.length());
   }
   return count;
}

答案 11 :(得分:0)

此处填写示例,

package com.test;

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class WordsOccurances {

      public static void main(String[] args) {

            String sentence = "Java can run on many different operating "
                + "systems. This makes Java platform independent.";

            String[] words = sentence.split(" ");
            Map<String, Integer> wordsMap = new HashMap<String, Integer>();

            for (int i = 0; i<words.length; i++ ) {
                if (wordsMap.containsKey(words[i])) {
                    Integer value = wordsMap.get(words[i]);
                    wordsMap.put(words[i], value + 1);
                } else {
                    wordsMap.put(words[i], 1);
                }
            }

            /*Now iterate the HashMap to display the word with number 
           of time occurance            */

           Iterator it = wordsMap.entrySet().iterator();
           while (it.hasNext()) {
                Map.Entry<String, Integer> entryKeyValue = (Map.Entry<String, Integer>) it.next();
                System.out.println("Word : "+entryKeyValue.getKey()+", Occurance : "
                                +entryKeyValue.getValue()+" times");
           }
     }
}

答案 12 :(得分:0)

对于scala,只有1行

def numTimesOccurrenced(text:String,word:String)= text.split(word).size-1

答案 13 :(得分:0)

公共类WordCount {

public static void main(String[] args) {
    // TODO Auto-generated method stub
    String scentence = "This is a treeis isis is is is";
    String word = "is";
    int wordCount = 0;
    for(int i =0;i<scentence.length();i++){
        if(word.charAt(0) == scentence.charAt(i)){
            if(i>0){
                if(scentence.charAt(i-1) == ' '){
                    if(i+word.length()<scentence.length()){
                        if(scentence.charAt(i+word.length()) != ' '){
                            continue;}
                        }
                    }
                else{
                    continue;
                }
            }
            int count = 1;
            for(int j=1 ; j<word.length();j++){
                i++;
                if(word.charAt(j) != scentence.charAt(i)){
                    break;
                }
                else{
                    count++;
                }
            }
            if(count == word.length()){
                wordCount++;
            }

        }
    }
    System.out.println("The word "+ word + " was repeated :" + wordCount);
}

}

答案 14 :(得分:0)

简单的解决方案就在这里 -

下面的代码使用HashMap,因为它将维护键和值。所以这里的键将是单词,值将被计数(在给定的字符串中出现一个单词)。

public class WordOccurance 
{

 public static void main(String[] args) 
 {
    HashMap<String, Integer> hm = new HashMap<>();
    String str = "avinash pande avinash pande avinash";

    //split the word with white space       
    String words[] = str.split(" ");
    for (String word : words) 
    {   
        //If already added/present in hashmap then increment the count by 1
        if(hm.containsKey(word))    
        {           
            hm.put(word, hm.get(word)+1);
        }
        else //if not added earlier then add with count 1
        {
            hm.put(word, 1);
        }

    }
    //Iterate over the hashmap
    Set<Entry<String, Integer>> entry =  hm.entrySet();
    for (Entry<String, Integer> entry2 : entry) 
    {
        System.out.println(entry2.getKey() + "      "+entry2.getValue());
    }
}

}

答案 15 :(得分:0)

public int occurrencesOf(String word) {
    int length = text.length();
    int lenghtofWord = word.length();
    int lengthWithoutWord = text.replaceAll(word, "").length();
    return (length - lengthWithoutWord) / lenghtofWord ;
}

答案 16 :(得分:0)

一旦找到该术语,您需要将其从正在处理的字符串中删除,以便再次解决不同的问题,使用indexOf()substring(),您不需要要做的包含检查长度时间

答案 17 :(得分:0)

我在这里有另一种方法:

String description = "hello india hello india hello hello india hello";
String textToBeCounted = "hello";

// Split description using "hello", which will return 
//string array of words other than hello
String[] words = description.split("hello");

// Get number of characters words other than "hello"
int lengthOfNonMatchingWords = 0;
for (String word : words) {
    lengthOfNonMatchingWords += word.length();
}

// Following code gets length of `description` - length of all non-matching
// words and divide it by length of word to be counted
System.out.println("Number of matching words are " + 
(description.length() - lengthOfNonMatchingWords) / textToBeCounted.length());

答案 18 :(得分:0)

我们可以通过多种方式计算子串的出现: -

public class Test1 {
public static void main(String args[]) {
    String st = "abcdsfgh yfhf hghj gjgjhbn hgkhmn abc hadslfahsd abcioh abc  a ";
    count(st, 0, "a".length());

}

public static void count(String trim, int i, int length) {
    if (trim.contains("a")) {
        trim = trim.substring(trim.indexOf("a") + length);
        count(trim, i + 1, length);
    } else {
        System.out.println(i);
    }
}

public static void countMethod2() {
    int index = 0, count = 0;
    String inputString = "mynameiskhanMYlaptopnameishclMYsirnameisjasaiwalmyfrontnameisvishal".toLowerCase();
    String subString = "my".toLowerCase();

    while (index != -1) {
        index = inputString.indexOf(subString, index);
        if (index != -1) {
            count++;
            index += subString.length();
        }
    }
    System.out.print(count);
}}

答案 19 :(得分:0)

子串的出现有很多种方式,主题有两种: -

public class Test1 {
public static void main(String args[]) {
    String st = "abcdsfgh yfhf hghj gjgjhbn hgkhmn abc hadslfahsd abcioh abc  a ";
    count(st, 0, "a".length());

}

public static void count(String trim, int i, int length) {
    if (trim.contains("a")) {
        trim = trim.substring(trim.indexOf("a") + length);
        count(trim, i + 1, length);
    } else {
        System.out.println(i);
    }
}

public static void countMethod2() {
    int index = 0, count = 0;
    String inputString = "mynameiskhanMYlaptopnameishclMYsirnameisjasaiwalmyfrontnameisvishal".toLowerCase();
    String subString = "my".toLowerCase();

    while (index != -1) {
        index = inputString.indexOf(subString, index);
        if (index != -1) {
            count++;
            index += subString.length();
        }
    }
    System.out.print(count);
}}

答案 20 :(得分:0)

这应该是一个更快的非正则表达式解决方案 (注意 - 不是Java程序员)

 String str = "i have a male cat. the color of male cat is Black";
 int found  = 0;
 int oldndx = 0;
 int newndx = 0;

 while ( (newndx=str.indexOf("male cat", oldndx)) > -1 )
 {
     found++;
     oldndx = newndx+8;
 }

答案 21 :(得分:0)

如果您找到要搜索的字符串,则可以继续查找该字符串的长度(如果您在aaaa中搜索aa,则将其视为2次)。

int c=0;
String found="male cat";
 for(int j=0; j<text.length();j++){
     if(text.contains(found)){
         c+=1;
         j+=found.length()-1;
     }
 }
 System.out.println("counter="+c);

答案 22 :(得分:0)

字符串在循环遍历时始终包含该字符串。你不想要++,因为现在正在做的只是获取字符串的长度,如果它包含&#34; &#34;雄猫&#34;

您需要indexOf()/ substring()

有点得到我说的话吗?