在字符串上查找重复的单词并计算重复次数

时间:2011-01-27 19:14:36

标签: java string repeat

我需要在字符串上找到重复的单词,然后计算它们被重复的次数。所以基本上,如果输入字符串是这样的:

String s = "House, House, House, Dog, Dog, Dog, Dog";

我需要创建一个没有重复的新字符串列表,并在其他地方保存每个单词的重复次数,如:

新字符串:“House,Dog”

New Int Array:[3,4]

有没有办法用Java轻松完成这项工作?我已经设法使用s.split()分隔字符串但是我如何计算重复并在新字符串上消除它们?谢谢!

29 个答案:

答案 0 :(得分:21)

你已经完成了艰苦的工作。现在,您只需使用Map来计算出现次数:

Map<String, Integer> occurrences = new HashMap<String, Integer>();

for ( String word : splitWords ) {
   Integer oldCount = occurrences.get(word);
   if ( oldCount == null ) {
      oldCount = 0;
   }
   occurrences.put(word, oldCount + 1);
}

使用map.get(word)会告诉您多次出现一个单词。您可以通过迭代map.keySet()

来构建新列表
for ( String word : occurrences.keySet() ) {
  //do something with word
}

请注意,keySet获得的顺序是任意的。如果您需要按照首次出现在输入字符串中的字词进行排序,则应使用LinkedHashMap代替。

答案 1 :(得分:3)

正如其他人所提到的,使用String :: split(),然后是一些map(hashmap或linkedhashmap),然后合并你的结果。为了完整起见,请放置代码。

import java.util.*;

public class Genric<E>
{
    public static void main(String[] args) 
    {
        Map<String, Integer> unique = new LinkedHashMap<String, Integer>();
        for (String string : "House, House, House, Dog, Dog, Dog, Dog".split(", ")) {
            if(unique.get(string) == null)
                unique.put(string, 1);
            else
                unique.put(string, unique.get(string) + 1);
        }
        String uniqueString = join(unique.keySet(), ", ");
        List<Integer> value = new ArrayList<Integer>(unique.values());

        System.out.println("Output = " + uniqueString);
        System.out.println("Values = " + value);

    }

    public static String join(Collection<String> s, String delimiter) {
        StringBuffer buffer = new StringBuffer();
        Iterator<String> iter = s.iterator();
        while (iter.hasNext()) {
            buffer.append(iter.next());
            if (iter.hasNext()) {
                buffer.append(delimiter);
            }
        }
        return buffer.toString();
    }
}

新字符串为Output = House, Dog

Int数组(或更确切地说是列表)Values = [3, 4](可以使用List :: toArray)来获取数组。

答案 2 :(得分:3)

试试这个,

public class DuplicateWordSearcher {
@SuppressWarnings("unchecked")
public static void main(String[] args) {

    String text = "a r b k c d se f g a d f s s f d s ft gh f ws w f v x s g h d h j j k f sd j e wed a d f";

    List<String> list = Arrays.asList(text.split(" "));

    Set<String> uniqueWords = new HashSet<String>(list);
    for (String word : uniqueWords) {
        System.out.println(word + ": " + Collections.frequency(list, word));
    }
}

}

答案 3 :(得分:2)

public class StringsCount{

    public static void main(String args[]) {

        String value = "This is testing Program testing Program";

        String item[] = value.split(" ");

        HashMap<String, Integer> map = new HashMap<>();

        for (String t : item) {
            if (map.containsKey(t)) {
                map.put(t, map.get(t) + 1);

            } else {
                map.put(t, 1);
            }
        }
        Set<String> keys = map.keySet();
        for (String key : keys) {
            System.out.println(key);
            System.out.println(map.get(key));
        }

    }
}

答案 4 :(得分:1)

使用java8

private static void findWords(String s, List<String> output, List<Integer> count){
    String[] words = s.split(", ");
    Map<String, Integer> map = new LinkedHashMap<>();
    Arrays.stream(words).forEach(e->map.put(e, map.getOrDefault(e, 0) + 1));
    map.forEach((k,v)->{
        output.add(k);
        count.add(v);
    });
}

此外,如果您想保留插入顺序,请使用LinkedHashMap

private static void findWords(){
    String s = "House, House, House, Dog, Dog, Dog, Dog";
    List<String> output = new ArrayList<>();
    List<Integer> count = new ArrayList<>();
    findWords(s, output, count);
    System.out.println(output);
    System.out.println(count);
}

输出

[House, Dog]
[3, 4]

答案 5 :(得分:1)

它可能会以某种方式帮助你。

String st="I am am not the one who is thinking I one thing at time";
String []ar = st.split("\\s");
Map<String, Integer> mp= new HashMap<String, Integer>();
int count=0;

for(int i=0;i<ar.length;i++){
    count=0;

    for(int j=0;j<ar.length;j++){
        if(ar[i].equals(ar[j])){
        count++;                
        }
    }

    mp.put(ar[i], count);
}

System.out.println(mp);

答案 6 :(得分:0)

package string;

import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class DublicatewordinanArray {
public static void main(String[] args) {
String str = "This is Dileep Dileep Kumar Verma Verma";
DuplicateString(str);
    }
public static void DuplicateString(String str) {
String word[] = str.split(" ");
Map < String, Integer > map = new HashMap < String, Integer > ();
for (String w: word)
if (!map.containsKey(w)) {
map.put(w, 1);
    }
else {
map.put(w, map.get(w) + 1);
        }
Set < Map.Entry < String, Integer >> entrySet = map.entrySet();
 for (Map.Entry < String, Integer > entry: entrySet)
if (entry.getValue() > 1) {
 System.out.printf("%s : %d %n", entry.getKey(), entry.getValue());
}
 }
}

答案 7 :(得分:0)

//program to find number of repeating characters in a string
//Developed by Rahul Lakhmara

import java.util.*;

public class CountWordsInString {
    public static void main(String[] args) {
        String original = "I am rahul am i sunil so i can say am i";
        // making String type of array
        String[] originalSplit = original.split(" ");
        // if word has only one occurrence
        int count = 1;
        // LinkedHashMap will store the word as key and number of occurrence as
        // value
        Map<String, Integer> wordMap = new LinkedHashMap<String, Integer>();

        for (int i = 0; i < originalSplit.length - 1; i++) {
            for (int j = i + 1; j < originalSplit.length; j++) {
                if (originalSplit[i].equals(originalSplit[j])) {
                    // Increment in count, it will count how many time word
                    // occurred
                    count++;
                }
            }
            // if word is already present so we will not add in Map
            if (wordMap.containsKey(originalSplit[i])) {
                count = 1;
            } else {
                wordMap.put(originalSplit[i], count);
                count = 1;
            }
        }

        Set word = wordMap.entrySet();
        Iterator itr = word.iterator();
        while (itr.hasNext()) {
            Map.Entry map = (Map.Entry) itr.next();
            // Printing
            System.out.println(map.getKey() + " " + map.getValue());
        }
    }
}

答案 8 :(得分:0)

    public static void main(String[] args){
    String string = "elamparuthi, elam, elamparuthi";
    String[] s = string.replace(" ", "").split(",");
    String[] op;
    String ops = "";

    for(int i=0; i<=s.length-1; i++){
        if(!ops.contains(s[i]+"")){
            if(ops != "")ops+=", "; 
            ops+=s[i];
        }

    }
    System.out.println(ops);
}

答案 9 :(得分:0)

对于没有空格的字符串,我们可以使用下面提到的代码

M[, idx]

传递一些输入作为&#34; haha​​ha&#34;或者&#34; ba na na&#34;或&#34; xxxyyyzzzxxxzzz&#34;给出所需的输出。

答案 10 :(得分:0)

一旦您从字符串中得到单词,这很容易。 从Java 10开始,您可以尝试以下代码:

import java.util.Arrays;
import java.util.stream.Collectors;

public class StringFrequencyMap {
    public static void main(String... args) {
        String[] wordArray = {"House", "House", "House", "Dog", "Dog", "Dog", "Dog"};
        var freq = Arrays.stream(wordArray)
                         .collect(Collectors.groupingBy(x -> x, Collectors.counting()));
        System.out.println(freq);
    }
}

输出:

{House=3, Dog=4}

答案 11 :(得分:0)

希望这会有所帮助:

public static int countOfStringInAText(String stringToBeSearched, String masterString){

    int count = 0;
    while (masterString.indexOf(stringToBeSearched)>=0){
      count = count + 1;
      masterString = masterString.substring(masterString.indexOf(stringToBeSearched)+1);
    }
    return count;
}

答案 12 :(得分:0)

请使用以下代码。根据我的分析,它是最简单的。希望你会喜欢它:

import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Scanner;
import java.util.Set;

public class MostRepeatingWord {

    String mostRepeatedWord(String s){
        String[] splitted = s.split(" ");
        List<String> listString = Arrays.asList(splitted);
        Set<String> setString = new HashSet<String>(listString);
        int count = 0;
        int maxCount = 1;
        String maxRepeated = null;
        for(String inp: setString){
            count = Collections.frequency(listString, inp);
            if(count > maxCount){
                maxCount = count;
                maxRepeated = inp;
            }
        }
        return maxRepeated;
    }
    public static void main(String[] args) 
    {       
        System.out.println("Enter The Sentence: ");
        Scanner s = new Scanner(System.in);
        String input = s.nextLine();
        MostRepeatingWord mrw = new MostRepeatingWord();
        System.out.println("Most repeated word is: " + mrw.mostRepeatedWord(input));

    }
}

答案 13 :(得分:0)

使用Java 8流collectors

public static Map<String, Integer> countRepetitions(String str) {
    return Arrays.stream(str.split(", "))
        .collect(Collectors.toMap(s -> s, s -> 1, (a, b) -> a + 1));
}

输入:"House, House, House, Dog, Dog, Dog, Dog, Cat"

输出:{Cat=1, House=3, Dog=4}

答案 14 :(得分:0)

使用Collectors.groupingBy内部的Function.identity()并将所有内容存储在MAP中。

String a  = "Gini Gina Gina Gina Gina Protijayi Protijayi "; 
        Map<String, Long> map11 = Arrays.stream(a.split(" ")).collect(Collectors
                .groupingBy(Function.identity(),Collectors.counting()));
        System.out.println(map11);
// output => {Gina=4, Gini=1, Protijayi=2}

答案 15 :(得分:0)

请尝试这些可能对您有帮助。

UITabBarController

答案 16 :(得分:0)

由于流的引入改变了我们的编码方式;我想添加一些使用它的方法

    String[] strArray = str.split(" ");
    
    //1. All string value with their occurrences
    Map<String, Long> counterMap = 
            Arrays.stream(strArray).collect(Collectors.groupingBy(e->e, Collectors.counting()));

    //2. only duplicating Strings
    Map<String, Long> temp = counterMap.entrySet().stream().filter(map->map.getValue() > 1).collect(Collectors.toMap(map -> map.getKey(), map -> map.getValue()));
    System.out.println("test : "+temp);
    
    //3. List of Duplicating Strings
    List<String> masterStrings = Arrays.asList(strArray);
    Set<String> duplicatingStrings = 
            masterStrings.stream().filter(i -> Collections.frequency(masterStrings, i) > 1).collect(Collectors.toSet());

答案 17 :(得分:0)

public class Counter {

private static final int COMMA_AND_SPACE_PLACE = 2;

private String mTextToCount;
private ArrayList<String> mSeparateWordsList;

public Counter(String mTextToCount) {
    this.mTextToCount = mTextToCount;

    mSeparateWordsList = cutStringIntoSeparateWords(mTextToCount);
}

private ArrayList<String> cutStringIntoSeparateWords(String text)
{
    ArrayList<String> returnedArrayList = new ArrayList<>();


    if(text.indexOf(',') == -1)
    {
        returnedArrayList.add(text);
        return returnedArrayList;
    }

    int position1 = 0;
    int position2 = 0;

    while(position2 < text.length())
    {
        char c = ',';
        if(text.toCharArray()[position2] == c)
        {
            String tmp = text.substring(position1, position2);
            position1 += tmp.length() + COMMA_AND_SPACE_PLACE;
            returnedArrayList.add(tmp);
        }
        position2++;
    }

    if(position1 < position2)
    {
        returnedArrayList.add(text.substring(position1, position2));
    }

    return returnedArrayList;
}

public int[] countWords()
{
    if(mSeparateWordsList == null) return null;


    HashMap<String, Integer> wordsMap = new HashMap<>();

    for(String s: mSeparateWordsList)
    {
        int cnt;

        if(wordsMap.containsKey(s))
        {
            cnt = wordsMap.get(s);
            cnt++;
        } else {
            cnt = 1;
        }
        wordsMap.put(s, cnt);
    }                
    return printCounterResults(wordsMap);
}

private int[] printCounterResults(HashMap<String, Integer> m)
{        
    int index = 0;
    int[] returnedIntArray = new int[m.size()];

    for(int i: m.values())
    {
        returnedIntArray[index] = i;
        index++;
    }

    return returnedIntArray;

}

}

答案 18 :(得分:0)

package day2;

import java.util.ArrayList;
import java.util.HashMap;`enter code here`
import java.util.List;

public class DuplicateWords {

    public static void main(String[] args) {
        String S1 = "House, House, House, Dog, Dog, Dog, Dog";
        String S2 = S1.toLowerCase();
        String[] S3 = S2.split("\\s");

        List<String> a1 = new ArrayList<String>();
        HashMap<String, Integer> hm = new HashMap<>();

        for (int i = 0; i < S3.length - 1; i++) {

            if(!a1.contains(S3[i]))
            {
                a1.add(S3[i]);
            }
            else
            {
                continue;
            }

            int Count = 0;

            for (int j = 0; j < S3.length - 1; j++)
            {
                if(S3[j].equals(S3[i]))
                {
                    Count++;
                }
            }

            hm.put(S3[i], Count);
        }

        System.out.println("Duplicate Words and their number of occurrences in String S1 : " + hm);
    }
}

答案 19 :(得分:0)

import java.util.HashMap;
import java.util.Scanner;
public class class1 {
public static void main(String[] args) {
    Scanner in = new Scanner(System.in);
    String inpStr = in.nextLine();
    int key;

    HashMap<String,Integer> hm = new HashMap<String,Integer>();
    String[] strArr = inpStr.split(" ");

    for(int i=0;i<strArr.length;i++){
        if(hm.containsKey(strArr[i])){
            key = hm.get(strArr[i]);
            hm.put(strArr[i],key+1);

        }
        else{
            hm.put(strArr[i],1);
        }   
    }
    System.out.println(hm);
}

}

答案 20 :(得分:0)

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DuplicateWord {

    public static void main(String[] args) {
        String para = "this is what it is this is what it can be";
        List < String > paraList = new ArrayList < String > ();
        paraList = Arrays.asList(para.split(" "));
        System.out.println(paraList);
        int size = paraList.size();

        int i = 0;
        Map < String, Integer > duplicatCountMap = new HashMap < String, Integer > ();
        for (int j = 0; size > j; j++) {
            int count = 0;
            for (i = 0; size > i; i++) {
                if (paraList.get(j).equals(paraList.get(i))) {
                    count++;
                    duplicatCountMap.put(paraList.get(j), count);
                }

            }

        }
        System.out.println(duplicatCountMap);
        List < Integer > myCountList = new ArrayList < > ();
        Set < String > myValueSet = new HashSet < > ();
        for (Map.Entry < String, Integer > entry: duplicatCountMap.entrySet()) {
            myCountList.add(entry.getValue());
            myValueSet.add(entry.getKey());
        }
        System.out.println(myCountList);
        System.out.println(myValueSet);
    }

}

输入:这就是它可以是什么

输出

[this,is,what,it,is,this,is,what,it,can,是]

{can = 1,what = 2,be = 1,this = 2,is = 3,it = 2}

[1,2,1,2,3,2]

[can,what,be,this,is,it]

答案 21 :(得分:0)

我希望这会对你有所帮助

public void countInPara(String str){

    Map<Integer,String> strMap = new HashMap<Integer,String>();
    List<String> paraWords = Arrays.asList(str.split(" "));
    Set<String> strSet = new LinkedHashSet<>(paraWords);
    int count;

    for(String word : strSet) {
        count = Collections.frequency(paraWords, word);
        strMap.put(count, strMap.get(count)==null ? word : strMap.get(count).concat(","+word));
    }

    for(Map.Entry<Integer,String> entry : strMap.entrySet())
        System.out.println(entry.getKey() +" :: "+ entry.getValue());
}

答案 22 :(得分:0)

import java.util.HashMap;
import java.util.LinkedHashMap;

public class CountRepeatedWords {

    public static void main(String[] args) {
          countRepeatedWords("Note that the order of what you get out of keySet is arbitrary. If you need the words to be sorted by when they first appear in your input String, you should use a LinkedHashMap instead.");
    }

    public static void countRepeatedWords(String wordToFind) {
        String[] words = wordToFind.split(" ");
        HashMap<String, Integer> wordMap = new LinkedHashMap<String, Integer>();

        for (String word : words) {
            wordMap.put(word,
                (wordMap.get(word) == null ? 1 : (wordMap.get(word) + 1)));
        }

            System.out.println(wordMap);
    }

}

答案 23 :(得分:0)

如果传递一个String参数,它将计算每个单词的重复次数

/**
 * @param string
 * @return map which contain the word and value as the no of repatation
 */
public Map findDuplicateString(String str) {
    String[] stringArrays = str.split(" ");
    Map<String, Integer> map = new HashMap<String, Integer>();
    Set<String> words = new HashSet<String>(Arrays.asList(stringArrays));
    int count = 0;
    for (String word : words) {
        for (String temp : stringArrays) {
            if (word.equals(temp)) {
                ++count;
            }
        }
        map.put(word, count);
        count = 0;
    }

    return map;

}

输出:

 Word1=2, word2=4, word2=1,. . .

答案 24 :(得分:0)

public static void main(String[] args) {
    String s="sdf sdfsdfsd sdfsdfsd sdfsdfsd sdf sdf sdf ";
    String st[]=s.split(" ");
    System.out.println(st.length);
    Map<String, Integer> mp= new TreeMap<String, Integer>();
    for(int i=0;i<st.length;i++){

        Integer count=mp.get(st[i]);
        if(count == null){
            count=0;
        }           
        mp.put(st[i],++count);
    }
   System.out.println(mp.size());
   System.out.println(mp.get("sdfsdfsd"));


}

答案 25 :(得分:0)

//program to find number of repeating characters in a string
//Developed by Subash<subash_senapati@ymail.com>


import java.util.Scanner;

public class NoOfRepeatedChar

{

   public static void main(String []args)

   {

//input through key board

Scanner sc = new Scanner(System.in);

System.out.println("Enter a string :");

String s1= sc.nextLine();


    //formatting String to char array

    String s2=s1.replace(" ","");
    char [] ch=s2.toCharArray();

    int counter=0;

    //for-loop tocompare first character with the whole character array

    for(int i=0;i<ch.length;i++)
    {
        int count=0;

        for(int j=0;j<ch.length;j++)
        {
             if(ch[i]==ch[j])
                count++; //if character is matching with others
        }
        if(count>1)
        {
            boolean flag=false;

            //for-loop to check whether the character is already refferenced or not 
            for (int k=i-1;k>=0 ;k-- )
            {
                if(ch[i] == ch[k] ) //if the character is already refferenced
                    flag=true;
            }
            if( !flag ) //if(flag==false) 
                counter=counter+1;
        }
    }
    if(counter > 0) //if there is/are any repeating characters
            System.out.println("Number of repeating charcters in the given string is/are " +counter);
    else
            System.out.println("Sorry there is/are no repeating charcters in the given string");
    }
}

答案 26 :(得分:0)

您可以使用前缀树(trie)数据结构来存储单词并跟踪前缀树节点中单词的数量。

  #define  ALPHABET_SIZE 26
  // Structure of each node of prefix tree
  struct prefix_tree_node {
    prefix_tree_node() : count(0) {}
    int count;
    prefix_tree_node *child[ALPHABET_SIZE];
  };
  void insert_string_in_prefix_tree(string word)
  {
    prefix_tree_node *current = root;
    for(unsigned int i=0;i<word.size();++i){
      // Assuming it has only alphabetic lowercase characters
            // Note ::::: Change this check or convert into lower case
    const unsigned int letter = static_cast<int>(word[i] - 'a');

      // Invalid alphabetic character, then continue
      // Note :::: Change this condition depending on the scenario
      if(letter > 26)
        throw runtime_error("Invalid alphabetic character");

      if(current->child[letter] == NULL)
        current->child[letter] = new prefix_tree_node();

      current = current->child[letter];
    }
  current->count++;
  // Insert this string into Max Heap and sort them by counts
}

    // Data structure for storing in Heap will be something like this
    struct MaxHeapNode {
       int count;
       string word;
    };

插入所有单词后,您必须通过迭代Maxheap来打印单词和计数。

答案 27 :(得分:0)

/*count no of Word in String using TreeMap we can use HashMap also but word will not display in sorted order */

import java.util.*;

public class Genric3
{
    public static void main(String[] args) 
    {
        Map<String, Integer> unique = new TreeMap<String, Integer>();
        String string1="Ram:Ram: Dog: Dog: Dog: Dog:leela:leela:house:house:shayam";
        String string2[]=string1.split(":");

        for (int i=0; i<string2.length; i++)
        {
            String string=string2[i];
            unique.put(string,(unique.get(string) == null?1:(unique.get(string)+1)));
        }

        System.out.println(unique);
    }
}      

答案 28 :(得分:0)

如果这是一个家庭作业,那么我只能说:使用String.split()HashMap<String,Integer>

(我看到你已经找到了split()。那么你就是正确的。)