Print all digrams in a string and their frequencies

Time: 2017-04-30 22:05:19

Tags: java string user-input

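I'm trying to write a program that reads a string of text and prints all digrams in that text along with their frequencies. A digram is a sequence of two characters. The program should print the digrams sorted by frequency, in descending order.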

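Sample input: park car at the parking lot

Corresponding output: ar:3 pa:2 rk:2 at:1 ca:1 he:1 in:1 ki:1 lo:1 ng:1 ot:1 th:1

I have this implementation, but it only counts single characters in the string. How would I do the same for each digram?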

import java.util.Scanner;

public class Digrams {
  public static void main(String args[]) {
    int ci, i, j, k, l=0;
    String str, str1;
    char c, ch;
    Scanner scan = new Scanner(System.in);

    System.out.print("Enter a String : ");
    str=scan.nextLine();

    i=str.length();
    for(c='A'; c<='z'; c++)
    {
        k=0;
        for(j=0; j<i; j++)
        {
            ch = str.charAt(j);
            if(ch == c)
            {
                k++;
            }
        }
        if(k>0)
        {
            System.out.println("" +c +": " +k);
        }
    }
  }
}

4 answers:

Answer 0: (score: 1)

One way to do this is to check every possible two-letter combination and count how often each one occurs. You can do this with a nested loop, like this:

public static void main(String args[]) {
    int i, j, k;
    String str, result, subString;
    char c1, c2;
    Scanner scan = new Scanner(System.in);

    System.out.print("Enter a String : ");
    str = scan.nextLine();

    i = str.length();
    // try every ordered pair of characters between 'A' and 'z'
    for (c1 = 'A'; c1 <= 'z'; c1++) {
        for (c2 = 'A'; c2 <= 'z'; c2++) {
            result = new String(new char[]{c1, c2});
            k = 0;
            // compare the candidate with every two-character window of the input
            for (j = 0; j < i - 1; j++) {
                subString = str.substring(j, j + 2);
                if (result.equals(subString)) {
                    k++;
                }
            }
            if (k > 0) {
                System.out.println("" + result + ": " + k);
            }
        }
    }
}

This also means that you have to compare strings rather than characters. That in turn means you need to use the .equals() function rather than the == operator, because String is an object type in Java.
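The nested loop above tries every candidate pair and prints the matches in character order, not by frequency. As a rough sketch of one alternative, not the answerer's code, a single pass can collect the counts in a map and sort them afterwards. The class name `DigramCounts` and its `count` and `format` methods are made up for illustration, and skipping pairs that contain a space is an assumption based on the sample output in the question.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the name and methods are illustrative only.
class DigramCounts {

    // Counts every pair of adjacent characters, skipping pairs that contain a space.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 1 < text.length(); i++) {
            String digram = text.substring(i, i + 2);
            if (digram.indexOf(' ') >= 0) {
                continue; // assumption: a digram must not span two words
            }
            counts.merge(digram, 1, Integer::sum);
        }
        return counts;
    }

    // Formats the counts as "xy:n" entries, highest count first, ties in alphabetical order.
    static String format(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Integer> e : entries) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(e.getKey()).append(':').append(e.getValue());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(format(count("park car at the parking lot")));
    }
}
```

For the sample sentence from the question this prints `ar:3 pa:2 rk:2 at:1 ca:1 he:1 in:1 ki:1 lo:1 ng:1 ot:1 th:1`.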

Answer 1: (score: 1)

This should help:

    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);

        System.out.print("Enter a String : ");
        String str = scan.nextLine();

        ArrayList<String> repetition = new ArrayList<String>();
        ArrayList<String> digrams = new ArrayList<String>();
        String digram;

        for(int i = 0; i < str.length() - 1; i++) {
            digram = str.substring(i, i + 2);
            if(repetition.contains(digram) || digram.contains(" ") || digram.length() < 2)
                continue;
            int occurances = (str.length() - str.replace(digram, "").length()) / 2;
            occurances += (str.replaceFirst(".*?(" + digram.charAt(0) + "+).*", "$1").length() - 1) / 2;
            digrams.add(digram + ":" + occurances);
            repetition.add(digram);
        }

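        // sort by the count after the colon, highest first; substring(3) also handles counts of 10 or more
        Collections.sort(digrams, (s1, s2) -> Integer.compare(Integer.parseInt(s2.substring(3)), Integer.parseInt(s1.substring(3))));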

        System.out.println(digrams);
}

Let me know if you don't want to use JDK 8.
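If JDK 8 is not available, the lambda passed to `Collections.sort` could be replaced with an anonymous `Comparator`. The following is only a sketch under that assumption: the class name `DigramSortLegacy` and the constant `BY_COUNT_DESCENDING` are invented here, and the entries are assumed to have the `xy:n` form that the code above builds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical class name; only the comparator is the point of this sketch.
class DigramSortLegacy {

    // Orders entries of the form "xy:n" by the number after the last colon, largest first.
    static final Comparator<String> BY_COUNT_DESCENDING = new Comparator<String>() {
        @Override
        public int compare(String s1, String s2) {
            int c1 = Integer.parseInt(s1.substring(s1.lastIndexOf(':') + 1));
            int c2 = Integer.parseInt(s2.substring(s2.lastIndexOf(':') + 1));
            return Integer.compare(c2, c1);
        }
    };

    public static void main(String[] args) {
        List<String> digrams = new ArrayList<String>(Arrays.asList("pa:2", "ar:3", "at:1", "rk:2"));
        Collections.sort(digrams, BY_COUNT_DESCENDING);
        System.out.println(digrams); // prints [ar:3, pa:2, rk:2, at:1]
    }
}
```

Because `Collections.sort` is stable, entries with equal counts keep the order in which they were added.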

Answer 2: (score: 1)

Here's how you can do it in one line:

Map<String, Long> digramFrequencies = Arrays
    .stream(str
        .replaceAll("(?<!^| ).(?! |$)", "$0$0") // double letters
        .split(" |(?<=\\G..)")) // split into digrams 
    .filter(s -> s.length() > 1) // discard short terms
    .collect(Collectors.groupingBy(s -> s, Collectors.counting()));

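See live demo.

This works by:

  • Doubling all letters that are not at the start or end of a word, e.g. "abc defg" becomes "abbc deeffg"
  • Splitting into pairs, restarting the split at the start of each word
  • Discarding short terms (e.g. words like "I" and "a")
  • Counting the frequencies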
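The steps above can be sketched as separate statements, which makes the intermediate values easier to inspect. The two regular expressions are copied from the one-liner; the class name `DigramRegexSteps`, the method names, and the final descending sort (which the question asks for but the one-liner does not perform) are additions made for illustration.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical class and method names; the two regular expressions come from the answer.
class DigramRegexSteps {

    static Map<String, Long> count(String str) {
        // Step 1: double every character that is neither first nor last in its word,
        // so that consecutive pairs of the result are the overlapping digrams.
        String doubled = str.replaceAll("(?<!^| ).(?! |$)", "$0$0");

        // Step 2: cut at spaces and after every two characters since the previous cut.
        String[] pieces = doubled.split(" |(?<=\\G..)");

        // Steps 3 and 4: drop pieces shorter than two characters, then count the rest.
        return Arrays.stream(pieces)
                .filter(s -> s.length() > 1)
                .collect(Collectors.groupingBy(s -> s, Collectors.counting()));
    }

    // Renders the map as "xy:n" entries, highest count first, ties in alphabetical order.
    static String format(Map<String, Long> counts) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.comparingByKey()))
                .map(e -> e.getKey() + ":" + e.getValue())
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(format(count("park car at the parking lot")));
    }
}
```

Splitting the pipeline this way costs a few extra lines, but a debugger or a temporary print of `doubled` and `pieces` then shows what each regular expression contributes.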

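Answer 3: (score: 1)

I know you've already received excellent answers that are better than this one, but I wanted to see whether I could sort the result in descending order without the help of the Collections class. It might be helpful, or at least offer a new idea.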

import java.util.ArrayList;
import java.util.Scanner;


public class Digrams{

    public static void main(String[] args){
        Scanner in = new Scanner(System.in);
        System.out.println("Insert The Sentence");
        String []sentence =  in.nextLine().split(" "); // split the input according to the spaces and put them in array

        //get all digrams
        ArrayList<String> allDigrams = new ArrayList<String>(); // ArrayList to contain all possible digrams
        for(int i=0; i<sentence.length; i++){ // do that for every word     
            for(int j=0; j<sentence[i].length(); j++){ // cycle through each char at each index in the sentence array
                String oneDigram= "";
                if(j<sentence[i].length()-1){
                    oneDigram += sentence[i].charAt(j); // append the char and the following char
                    oneDigram += sentence[i].charAt(j+1);
                    allDigrams.add(oneDigram); // add the one digram to the ArrayList
                }
            }
        }

        // isolate digrams and get corresponding frequencies
        ArrayList<Integer> frequency = new ArrayList<Integer>(); // for frequencies
        ArrayList<String>  digrams = new ArrayList<String>(); //for digrams
        int freqIndex=0;
        while(allDigrams.size()>0){ 
            frequency.add(freqIndex,0);
            for(int j=0; j<allDigrams.size(); j++){ // compare each UNIQUE digram with the rest of the digrams to find repetition
                if(allDigrams.get(0).equalsIgnoreCase(allDigrams.get(j))){
                    frequency.set(freqIndex, frequency.get(freqIndex)+1); // increment frequency    
                }
            }
            String dig = allDigrams.get(0); // record the digram temporarily
            while(allDigrams.contains(dig)){ // now remove all repetition from the allDigrams ArrayList
                allDigrams.remove(dig);
            }
            digrams.add(dig); // add the UNIQUE digram
            freqIndex++; // move to next index for the following digram 
        }


        // sort result in descending order
        // compare the frequency , if equal -> the first char of digram, if equal -> the second char of digram
        // and move frequencies and digrams at every index in each ArrayList accordingly
        for (int i = 0 ; i < frequency.size(); i++){
            for (int j = 0 ; j < frequency.size() - i - 1; j++){
                // unbox once so that == compares values rather than Integer references
                int freqA = frequency.get(j);
                int freqB = frequency.get(j+1);
                // swap on lower frequency, or on equal frequency when the digrams are out of alphabetical order
                if (freqA < freqB ||
                      (freqA == freqB && digrams.get(j).compareTo(digrams.get(j+1)) > 0)){
                    int swap  = frequency.get(j);
                    String swapS = digrams.get(j);
                    frequency.set(j, frequency.get(j+1));
                    frequency.set(j+1, swap);
                    digrams.set(j, digrams.get(j+1));
                    digrams.set(j+1, swapS);
                }
            }
        }


         //final result
         String sortedResult="";
         for(int i=0; i<frequency.size(); i++){
             sortedResult+=digrams.get(i) + ":" + frequency.get(i) + " ";
         }

         System.out.println(sortedResult);

    }

}

Input

park car at the parking lot

Output

ar:3 pa:2 rk:2 at:1 ca:1 he:1 in:1 ki:1 lo:1 ng:1 ot:1 th:1