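Find the word with the most occurrences of each letter of the alphabet

Time: 2013-11-23 13:38:02

Tags: java arrays iterator

I wrote a program that reads from a text file in which every line is a single word. The first part of the code finds the longest word beginning with each letter of the alphabet and stores it in an array. The second part, which I still need to get working, should find, for each letter of the alphabet, the word that contains the most occurrences of that letter.

So the output should look like this:

Longest words: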

a: anthropomorphologically
b: blepharosphincterectomy
c: cholecystenterorrhaphy
d: dacryocystoblennorrhea
e: epididymodeferentectomy
f: formaldehydesulphoxylate
g: gastroenteroanastomosis
h: hematospectrophotometer
i: indistinguishableness
j: jurisprudentialist
k: keratoconjunctivitis
l: laparocolpohysterotomy
m: macracanthrorhynchiasis
n: naphthylaminesulphonic
o: omnirepresentativeness
p: pathologicopsychological
q: quadratomandibular
r: reticulatocoalescent
s: scientificophilosophical
t: tetraiodophenolphthalein
u: ureterocystanastomosis
v: vagoglossopharyngeal
w: weatherproofness
x: xanthocreatinine
y: yohimbinization
z: zoologicoarchaeologist

Words with most letters:

a: astragalocalcaneal
b: beblubber
c: chlorococcaceae
d: disdodecahedroid
e: electrotelethermometer
f: giffgaff
g: cuggermugger
h: choledochorrhaphy
i: impossibilification
j: ajaja
k: akiskemikinik
l: allochlorophyll
m: dynamometamorphism
n: nonannouncement
o: choledochoduodenostomy
p: aplopappus
q: equivoque
r: archcorrupter
s: possessionlessness
t: anticonstitutionalist
u: untumultuous
v: overconservative
w: bowwow
x: adnexopexy
y: dacryocystosyringotomy
z: zizz


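Basically, I need to figure out how to do this so that the output is not limited to words whose most frequent letter is also their first letter (note how the 'f' entry above, giffgaff, does not begin with 'f'). I have googled and binged a lot and have not found anything helpful.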

/**
 * @param args first String argument is the
 *        name of the input text file
 */
public static void main(String [] args) throws IOException {

    //local variables
    String[] longestWords = new String[26];
    String[] mostCharsWord = new String[26];
    String currentLine = null;

    int[] numCharacters = new int[26];

    //because the while loop in try statement is comparing lengths in order to
    //assign words, I must give each element a non-null value
    //in this case, length = 0
    Arrays.fill(longestWords, "");
    Arrays.fill(mostCharsWord, "");

    //try block
    try(BufferedReader br = new BufferedReader(new FileReader(args[0]))) {
        String currentLongestWord;
        int index;
        int indexer = 0;
        int count = 0;
        int counter = 0;

        while((currentLine=br.readLine()) != null) {
            currentLine = currentLine.toLowerCase();
            index = currentLine.charAt(0)-'a';
            currentLongestWord = longestWords[index];
            if(currentLine.length() > currentLongestWord.length()) {
                longestWords[index] = currentLine;
            }

            /**
             * this code below is for the "AND" bit, but I know that it's not correct.
             * Instead of printing out the word with the most occurrences of each
             * letter, it prints out the word with the most occurrences of each letter
             * THAT BEGINS WITH THAT LETTER
             */

            for(char c : currentLine.toCharArray()) {
                if(c == currentLine.charAt(0)) {
                    count += 1;
                }
            }

            for(String currentMostCharsWord : mostCharsWord) {
                indexer += 1;
                for(char c : currentLine.toCharArray()) {
                    for( char d: currentMostCharsWord.toCharArray()) {
                        if(c==d) {
                            //hmmm....this would compare every letter, not just the one
                            //that I'm looking for. booooooo
                        }
                    }
                }
            }

            if(count > numCharacters[index]) {
                numCharacters[index] = count;
                mostCharsWord[index] = currentLine;
            }

            count = 0;
        }

        //close file!
        br.close();
    }

    //catch block
    catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

    //finally / do anyway statement
    finally {
        System.out.println("Longest Words: \n");
        for(int j = 0; j < 26; j++) {
            System.out.printf("%s: %s\n", longestWords[j].charAt(0), longestWords[j]);
        }

        System.out.println("------------------------------------\n\nWords with most letters: \n");
        for(int j = 0; j < 26; j++) {
            System.out.printf("%s: %s\n", mostCharsWord[j].charAt(0), mostCharsWord[j]);
        }
    }   
}

}

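2 answers:

Answer 0: (score: 0)

There may be a more direct approach. The problem is essentially that you have a stream of words, and you compare each word read from the stream with the longest word already known in your data store. If the new word is longer, you replace the stored word; otherwise you do nothing. Your logic could instead replace it based on something else, such as the lexicographic order of the words. Handling case sensitivity is left as an exercise.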

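// Mostly pseudocode in the original; tidied here so that it compiles

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class LongestWord
{
    // Returns the longest word seen so far for each starting letter
    static Map<Character, String> longestWords(Iterator<String> wordStream)
    {
        Map<Character, String> longestWords = new HashMap<Character, String>();

        while (wordStream.hasNext())
        {
            String currentWord = wordStream.next();
            String longestWordByLetter = longestWords.get(currentWord.charAt(0));
            if (null != longestWordByLetter)
            {
                if (longestWordByLetter.length() < currentWord.length())
                {
                    longestWords.put(currentWord.charAt(0), currentWord);
                } // else do nothing
            } else {
                // first word seen for this starting letter
                longestWords.put(currentWord.charAt(0), currentWord);
            }
        }
        return longestWords;
    }
}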

Answer 1: (score: 0)

You can use something like the following:

// Map with the longest word for each letter
Map<Character, String> longestWordMap = new HashMap<Character, String>();
// Map with the word with highest occurrences of each letter
Map<Character, String> mostCharsWordMap = new HashMap<Character, String>();

while ((word = br.readLine()) != null) {
    word = word.toLowerCase();
    Character beginning = word.charAt(0);
    String longestWord = longestWordMap.get(beginning);
    // If the current word is the longest, put the word in the map
    if (longestWord == null || word.length() > longestWord.length()) {
            longestWordMap.put(beginning, word);
    }
    for (char letter = 'a'; letter <= 'z'; letter++) {
        String mostCharsWord = mostCharsWordMap.get(Character.valueOf(letter));
        if (mostCharsWord == null || 
            characterCount(letter, word) > characterCount(letter, mostCharsWord)) {
            mostCharsWordMap.put(Character.valueOf(letter), word);
        }
    }
}

Here is the function that counts the occurrences of a letter in a word:

public static int characterCount(char letter, String word) {
    int characterCount = 0;
    for (char c : word.toCharArray()) {
        if (c == letter) {
            characterCount++;
        }
    }
    return characterCount;
}
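
For comparison, here is a minimal self-contained sketch of the second part that keeps the array layout from the question instead of maps. It is not taken from either answer: the class name `MostLetterWords`, the method `mostCharsWords`, and the sample word list are made up for illustration. It counts all 26 letters of a word in one pass and then updates each letter's slot separately, so the result no longer depends on the word's first letter. Ties keep the word that was read first, and characters outside a-z are ignored; both are assumptions, since the question does not say how they should be handled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class MostLetterWords {

    // For each letter a-z, returns the word containing the most occurrences
    // of that letter. An entry stays "" if no word contains the letter.
    // (Hypothetical helper, not part of the original question or answers.)
    static String[] mostCharsWords(Iterable<String> words) {
        String[] best = new String[26];
        int[] bestCount = new int[26];
        Arrays.fill(best, "");

        for (String raw : words) {
            String word = raw.toLowerCase(Locale.ROOT);

            // Count every letter of this word in a single pass.
            int[] counts = new int[26];
            for (char c : word.toCharArray()) {
                if (c >= 'a' && c <= 'z') {
                    counts[c - 'a']++;
                }
            }

            // Update each letter's slot, regardless of the word's first letter.
            for (int i = 0; i < 26; i++) {
                if (counts[i] > bestCount[i]) {
                    bestCount[i] = counts[i];
                    best[i] = word;
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up sample input; a real run would read the words from the file.
        List<String> sample = Arrays.asList("giffgaff", "fluff", "zizz", "bowwow");
        String[] best = mostCharsWords(sample);
        for (int i = 0; i < 26; i++) {
            // The label comes from the index, not from the word's first character.
            System.out.printf("%c: %s%n", (char) ('a' + i), best[i]);
        }
    }
}
```

Note that the printing loop takes the label from the array index. Printing `mostCharsWord[j].charAt(0)` as in the question would label giffgaff as 'g', and it would throw on an empty entry.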