Wrong number of words returned for a sentence

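Time: 2013-10-27 13:16:48

Tags: java

My code returns the wrong number of words in a sentence. I have a variable that holds the delimiters. The text file I am reading is included below. Thanks for your help.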

Text being read: "one!!!!! two!!!!! three: Baba? Oho! Bubu and bebe."

WORD_DELIMETERS = ".:;?!,''";

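The output I get:

There are 9 words in the file.

There are 14 vowels in the file.

There are 6 sentences in the file.

It should return 8 words instead of 9; the sentence and vowel counts are correct.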

//START of count the words********************************************
int wordCounter= 0;
int last_Index=0;
for(int i=0;i<myFile.length()-1;i++){
     for(int j=0;j<WORD_DELIMETERS.length()-1;j++){
        if(myFile.charAt(i)==WORD_DELIMETERS.charAt(j)){
                if(myFile.charAt(i+1) !=' '){
                    if(last_Index!=i-1){
                        wordCounter++;
                    }
                    last_Index=i;

                }
            } 
       } 
}
// END of count the words***********************************************    

2 Answers:

Answer 0 (score: 0)

You are not counting the first word. So if the string is not empty, you have to start counting from 1.

I use this:

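    String myFile = "one!!!!! two!!!!! three: Baba? Oho! Bubu and bebe.";
    // Inside a character class '|' is a literal, so it is left out here.
    String pattern = "[.:;?!,' ]";
    int counter = 0;
    // Consecutive delimiters produce empty strings; skip them.
    for (String word : myFile.split(pattern)) {
        if (word.length() != 0) counter++;
    }
    System.out.println("Words: " + counter); // prints Words: 8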

You can change your counter to:

public static int yourCounter(String myFile) {
    if(myFile.length()==0)return 0;
    String WORD_DELIMETERS = ".:;?!,' ]";
    int wordCounter= WORD_DELIMETERS.contains(myFile.charAt(0)+"")?0:1;
    int last_Index=0;
    for(int i=0;i<myFile.length()-1;i++){
         for(int j=0;j<WORD_DELIMETERS.length()-1;j++){
            if(myFile.charAt(i)==WORD_DELIMETERS.charAt(j)){
                    if(myFile.charAt(i+1) !=' '){
                        if(last_Index!=i-1){
                            wordCounter++;
                        }
                        last_Index=i;

                    }
                } 
           } 
    }
    return wordCounter;
}
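
Hand-tracing the sample sentence through this loop suggests it still increments twice for a run such as `!!!!! ` (once for the run of `!`, once for the space that follows) and never visits the final character, so the loop alone yields 9 increments for the sample. A minimal alternative sketch, assuming the delimiters are the question's punctuation plus the space: count word starts, that is, non-delimiter characters that open the string or follow a delimiter. The names `WordStartCounter`, `WORD_DELIMITERS` and `countWords` are hypothetical and not part of the original code.

```java
public class WordStartCounter {
    // Assumed delimiter set: the punctuation from the question plus the space.
    static final String WORD_DELIMITERS = ".:;?!,' ";

    // Counts words by counting word starts: a non-delimiter character
    // that is either the first character or preceded by a delimiter.
    static int countWords(String text) {
        int words = 0;
        boolean insideWord = false;
        for (int i = 0; i < text.length(); i++) {
            boolean isDelimiter = WORD_DELIMITERS.indexOf(text.charAt(i)) >= 0;
            if (!isDelimiter && !insideWord) {
                words++; // first character of a new word
            }
            insideWord = !isDelimiter;
        }
        return words;
    }

    public static void main(String[] args) {
        String sample = "one!!!!! two!!!!! three: Baba? Oho! Bubu and bebe.";
        System.out.println("Words: " + countWords(sample)); // prints Words: 8
    }
}
```

Because the scan tracks only whether it is currently inside a word, runs of delimiters, a leading delimiter and a missing trailing delimiter need no special cases.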

I use split and a regular expression. See below, with a long text example:

public static void main(String[] args) throws Throwable {
    String text = "Java is a computer programming langua"
            + "ge that is concurrent, class-based, objec"
            + "t-oriented, and specifically designed to "
            + "have as few implementation dependencies a"
            + "s possible. It is intended to let applica"
            + "tion developers \"write once, run anywher"
            + "e\" (WORA), meaning that code that runs o"
            + "n one platform does not need to be recomp"
            + "iled to run on another. Java applications"
            + " are typically compiled to bytecode (clas"
            + "s file) that can run on any Java virtual "
            + "machine (JVM) regardless of computer arch"
            + "itecture. Java is, as of 2012, one of the"
            + " most popular programming languages in us"
            + "e, particularly for client-server web app"
            + "lications, with a reported 9 million deve"
            + "lopers.[10][11] Java was originally devel"
            + "oped by James Gosling at Sun Microsystems"
            + " (which has since merged into Oracle Corp"
            + "oration) and released in 1995 as a core c"
            + "omponent of Sun Microsystems' Java platfo"
            + "rm. The language derives much of its synt"
            + "ax from C and C++, but it has fewer low-l"
            + "evel facilities than either of them. The "
            + "original and reference implementation Jav"
            + "a compilers, virtual machines, and class "
            + "libraries were developed by Sun from 1991"
            + " and first released in 1995. As of May 20"
            + "07, in compliance with the specifications"
            + " of the Java Community Process, Sun relic"
            + "ensed most of its Java technologies under"
            + " the GNU General Public License. Others h"
            + "ave also developed alternative implementa"
            + "tions of these Sun technologies, such as "
            + "the GNU Compiler for Java (bytecode compi"
            + "ler), GNU Classpath (standard libraries),"
            + " and IcedTea-Web (browser plugin for appl"
            + "ets).";
    System.out.println("Text:\n"+text+"\n--------------------\nWords: "+countWords(text) + "\nSentecens: " + countSentecens(text)
            + "\nVowels: " + countVowels(text) + "\nChars: "
            + text.toCharArray().length + "\nUpper cases: "
            + countUpperCases(text)+"\nYour counter of words: "+yourCounter(text));
}


public static int yourCounter(String myFile) {
    if(myFile.length()==0)return 0;
    String WORD_DELIMETERS = ".:;?!,' ]";
    int wordCounter= WORD_DELIMETERS.contains(myFile.charAt(0)+"")?0:1;
    int last_Index=0;
    for(int i=0;i<myFile.length()-1;i++){
         for(int j=0;j<WORD_DELIMETERS.length()-1;j++){
            if(myFile.charAt(i)==WORD_DELIMETERS.charAt(j)){
                    if(myFile.charAt(i+1) !=' '){
                        if(last_Index!=i-1){
                            wordCounter++;
                        }
                        last_Index=i;

                    }
                } 
           } 
    }
    return wordCounter;
}

public static int countUpperCases(String text) {
    int upper = 0;
    char[] compare1 = text.toCharArray();
    char[] compare2 = text.toUpperCase().toCharArray();
    for (int i = 0; i < compare1.length; i++) {
        // True when toUpperCase() changed the character, i.e. it was lowercase.
        if (compare1[i] != compare2[i])
            upper++;
    }
    return upper;
}

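public static int countWords(String text) {
    // Inside a character class '|' is a literal, so it is left out here.
    String pattern = "[.:;?!,' ]";
    int counter = 0;
    // Consecutive delimiters produce empty strings; skip them.
    for (String word : text.split(pattern)) {
        if (word.length() != 0)
            counter++;
    }
    return counter;
}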

public static int countSentecens(String text) {
    // Sentence terminators only; '|' is not needed inside a character class.
    String pattern = "[.?!]";
    int counter = 0;
    for (String word : text.split(pattern)) {
        if (word.length() != 0)
            counter++;
    }
    return counter;
}

public static int countVowels(String text) {
    int vowels = 0;
    for (char c : text.toCharArray()) {
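        switch (c) {
        // No break statements: each case falls through to the ones below,
        // so 'a' adds 5, 'e' adds 4, 'i' adds 3, 'o' adds 2 and 'u' adds 1.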
        case 'a':
            vowels++;
        case 'e':
            vowels++;
        case 'i':
            vowels++;
        case 'o':
            vowels++;
        case 'u':
            vowels++;
        }
    }
    return vowels;
}

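This returns:

Text:

Java is a computer programming language that is concurrent, class-based, object-oriented, and specifically designed to have as few implementation dependencies as possible. It is intended to let application developers "write once, run anywhere" (WORA), meaning that code that runs on one platform does not need to be recompiled to run on another. Java applications are typically compiled to bytecode (class file) that can run on any Java virtual machine (JVM) regardless of computer architecture. Java is, as of 2012, one of the most popular programming languages in use, particularly for client-server web applications, with a reported 9 million developers.[10][11] Java was originally developed by James Gosling at Sun Microsystems (which has since merged into Oracle Corporation) and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++, but it has fewer low-level facilities than either of them. The original and reference implementation Java compilers, virtual machines, and class libraries were developed by Sun from 1991 and first released in 1995. As of May 2007, in compliance with the specifications of the Java Community Process, Sun relicensed most of its Java technologies under the GNU General Public License. Others have also developed alternative implementations of these Sun technologies, such as the GNU Compiler for Java (bytecode compiler), GNU Classpath (standard libraries), and IcedTea-Web (browser plugin for applets).


Words: 230

Sentecens: 9

Vowels: 1597

Chars: 1516

Upper cases: 1154

Your counter of words: 230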
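
The vowel and upper-case figures deserve a second look: 1597 vowels cannot fit in 1516 characters. In `countVowels` the `switch` has no `break`, so every `a` adds 5, every `e` adds 4, and so on, while `countUpperCases` counts the characters that change under `toUpperCase()`, which are the lowercase letters. A minimal sketch of both counters follows, assuming that only the ASCII vowels matter and that uppercase vowels count as well (this matches the 14 vowels the question reports for its sample). The class name `TextStats` is hypothetical.

```java
public class TextStats {
    // Counts a, e, i, o, u in either case; exactly one increment per vowel.
    static int countVowels(String text) {
        int vowels = 0;
        for (char c : text.toCharArray()) {
            if ("aeiouAEIOU".indexOf(c) >= 0) {
                vowels++;
            }
        }
        return vowels;
    }

    // Counts the characters that Character.isUpperCase reports as uppercase.
    static int countUpperCases(String text) {
        int upper = 0;
        for (char c : text.toCharArray()) {
            if (Character.isUpperCase(c)) {
                upper++;
            }
        }
        return upper;
    }

    public static void main(String[] args) {
        String sample = "one!!!!! two!!!!! three: Baba? Oho! Bubu and bebe.";
        System.out.println("Vowels: " + countVowels(sample));          // prints Vowels: 14
        System.out.println("Upper cases: " + countUpperCases(sample)); // prints Upper cases: 3
    }
}
```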

Answer 1 (score: 0)

You only need one line:

int words = myFile.split("[" + WORD_DELIMETERS + "]+").length;

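This uses a regular expression to split the input into (approximately) words, then uses the length of the resulting array as the count. The plus sign after the character class treats several consecutive delimiters as a single delimiter, so "one!!!! two!!!" counts as two words.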
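
One caveat with the one-liner: `String.split` keeps a leading empty string when the input starts with a delimiter and returns a one-element array for an empty input, so `.length` alone can overcount by one in those cases. A minimal sketch that adjusts for this, assuming the delimiters are the question's punctuation plus a space; the names `SplitWordCounter` and `countWordsBySplit` are hypothetical.

```java
public class SplitWordCounter {
    // Assumed delimiters; none of them is special inside a character class.
    static final String WORD_DELIMITERS = ".:;?!,' ";

    static int countWordsBySplit(String text) {
        String[] parts = text.split("[" + WORD_DELIMITERS + "]+");
        int count = parts.length;
        // A leading delimiter (or an empty input) leaves an empty first element.
        if (count > 0 && parts[0].isEmpty()) {
            count--;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWordsBySplit("one!!!! two!!!")); // prints 2
        System.out.println(countWordsBySplit("!!one two"));      // prints 2
        System.out.println(countWordsBySplit(""));               // prints 0
    }
}
```

If the delimiter string ever contains characters that are special inside a character class, such as `]`, `\`, `^` or `-`, they would need escaping before being spliced into the pattern.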