I have a homework assignment from my university: https://cs1331.gitlab.io/fall2018/hw2/hw2-source-model.html. I wrote the code, but when I run the program I get the following message on the console:
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: begin 0, end -1, length 2
at java.base/java.lang.String.checkBoundsBeginEnd(String.java:3107)
at java.base/java.lang.String.substring(String.java:1873)
at homework1.SourceModel.main(SourceModel.java:127)
Here is my homework code, with comments:
package homework1;

import java.util.Scanner;
import java.io.File;
import java.io.FileNotFoundException;

public class SourceModel {
    //initialize variables so they can be accessed everywhere
    private String modelName;
    private int[][] characterCount;
    private double[] rowCount;
    private double[][] probability;
    /**
     *
     * @param name takes the name of the corpus
     * @param fileName takes the file name of the corpus
     */
    public SourceModel(String name, String fileName) {
        modelName = name;
        characterCount = new int[26][26];
        rowCount = new double[26];
        probability = new double[26][26];
        System.out.println("Training " + name + "model...");
        try {
            Scanner scan = new Scanner(new File(fileName));
            String temp = "";
            //append all of the text
            while (scan.hasNext()) {
                temp += scan.next();
            }
            //only keeps the letters and makes them lowercase
            temp = temp.replaceAll("[^A-Za-z]+", "").toLowerCase();
            System.out.println(temp);
            //iterates through each letter, then adds each letter
            //pair to the respective row and column
            for (int i = 0; i < (temp.length() - 1); i++) {
                char firstLetter = temp.charAt(i);
                char secondLetter = temp.charAt(i + 1);
                //index based on ASCII values
                characterCount[(int) firstLetter - 97][(int) secondLetter - 97]++;
                rowCount[(int) firstLetter - 97]++;
            }
            //calculates the probability by dividing the count
            //by the total counts in each row
            for (int i = 0; i < probability.length; i++) {
                for (int j = 0; j < probability[i].length; j++) {
                    if (rowCount[i] == 0) {
                        rowCount[i] = 0.01;
                    }
                    probability[i][j] = (((double) characterCount[i][j]) / rowCount[i]);
                    if (probability[i][j] == 0) {
                        probability[i][j] = 0.01;
                    }
                }
            }
            System.out.println("done");
        }
        catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
    /**
     *
     * @return a string which contains the name
     */
    public String getName() {
        return modelName;
    }
    /**
     * @return a string with the matrix
     */
    public String toString() {
        String matrix = "";
        matrix += "";
        for (int i = 97; i < 123; i++) {
            matrix += " ";
            matrix += (char) i;
        }
        matrix += ("\n");
        for (int i = 0; i < probability.length; i++) {
            matrix += ((char) (i + 97) + " ");
            for (int j = 0; j < probability[i].length; j++) {
                matrix += String.format("%.2f", probability[i][j]);
                matrix += ("");
            }
            matrix += "\n";
        }
        return matrix;
    }
    /**
     *
     * @param test a set of letters to test
     * @return the probability for the word
     */
    public double probability(String test) {
        test = test.replaceAll("[^A-Za-z]+", "").toLowerCase();
        double stringProbability = 1.0;
        for (int i = 0; i < test.length() - 1; i++) {
            int firstIndex = (int) (test.charAt(i)) - 97;
            int secondIndex = (int) (test.charAt(i + 1)) - 97;
            stringProbability *= probability[firstIndex][secondIndex];
        }
        return stringProbability;
    }
    /**
     *
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        SourceModel[] models = new SourceModel[args.length - 1];
        for (int i = 0; i < args.length - 1; i++) {
            models[i] = new SourceModel(args[i].substring(0, args[i].indexOf(".")), args[i]);
        }
        System.out.println("Analyzing: " + args[args.length - 1]);
        double[] normalizedProbability = new double[args.length - 1];
        double sumProbability = 0;
        for (int i = 0; i < args.length - 1; i++) {
            sumProbability += models[i].probability(args[args.length - 1]);
        }
        //normalize the probability in respect to the values given
        for (int i = 0; i < normalizedProbability.length; i++) {
            normalizedProbability[i] = models[i].probability(args[args.length - 1]) / sumProbability;
        }
        int highestIndex = 0;
        for (int i = 0; i < args.length - 1; i++) {
            System.out.print("Probability that test string is");
            System.out.printf("%9s: ", models[i].getName());
            System.out.printf("%.2f", normalizedProbability[i]);
            System.out.println("");
            if (normalizedProbability[i] > normalizedProbability[highestIndex]) {
                highestIndex = i;
            }
        }
        System.out.println("Test string is most likely " + models[highestIndex].getName() + ".");
    }
}
Answer 0 (score: 3)
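Others have already pointed this out, but consider this line:
models[i] = new SourceModel(args[i].substring(0, args[i].indexOf(".")), args[i]);
The substring call is clearly what throws here, because indexOf returns -1 when "." is not found, and -1 is not a valid end index.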
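The failure can be reproduced in isolation. The following is only a minimal sketch: the class name SubstringDemo and the helper nameBeforeDot are hypothetical and simply wrap the same substring/indexOf expression used in main.

```java
// Hypothetical demo class; not part of the assignment code.
class SubstringDemo {
    /** Same expression as in main: everything before the first '.'. */
    static String nameBeforeDot(String arg) {
        return arg.substring(0, arg.indexOf("."));
    }

    public static void main(String[] args) {
        // With a dot present, indexOf returns its position and substring works.
        System.out.println(nameBeforeDot("english.corpus"));

        // Without a dot, indexOf returns -1 ...
        System.out.println("GB".indexOf("."));

        // ... and substring(0, -1) throws StringIndexOutOfBoundsException.
        try {
            nameBeforeDot("GB");
        } catch (StringIndexOutOfBoundsException e) {
            // The exact message wording depends on the JDK version.
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```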
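In this case, however, the code itself is not really the problem, because the assignment says you may assume that file names have the form <source-name>.corpus. That means every corpus-file argument (every argument except the last one, which is the test string) should contain a ".", so this should not happen.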
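I would check the command-line arguments you are passing. My guess is that one of your file names contains a space or something similar. For example, if you pass English GB.corpus, it is seen as two separate arguments (one of which has no ".").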
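Edit: As @Pshemo pointed out in the comments, if a file name contains a space you can put it in quotes so that it is interpreted as a single command-line argument. For example, write "English GB.corpus" instead of English GB.corpus. That prevents the exception.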
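To see how the arguments actually arrive in main, a small diagnostic can help. This is a sketch only; the class ArgDump and its describe method are hypothetical names, not part of the assignment. It prints each argument in brackets and flags the ones that contain no ".".

```java
// Hypothetical diagnostic class; not part of the assignment code.
class ArgDump {
    /** Builds one line per argument, marking arguments that contain no '.'. */
    static String describe(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < args.length; i++) {
            sb.append("args[").append(i).append("] = [").append(args[i]).append("]");
            if (!args[i].contains(".")) {
                sb.append("  (no '.')");
            }
            sb.append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(args));
    }
}
```

Running it once as java ArgDump English GB.corpus and once as java ArgDump "English GB.corpus" shows the difference between two arguments and one. The trace in the question reports length 2, which suggests the offending argument was only two characters long.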
Answer 1 (score: 0)
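In your main method, you have:
args[i].indexOf(".")
The dot (.) is not found, so it returns -1.
You then try to create a string with:
models[i] = new SourceModel(args[i].substring(0, args[i].indexOf(".")), args[i]);
But because args[i].indexOf(".") returned -1, which is not a valid end index, substring throws the exception.
What you can do is check whether the dot (.) is present and, if so, proceed: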
if (args[i].contains(".")) {
    models[i] = new SourceModel(args[i].substring(0, args[i].indexOf(".")), args[i]);
}
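One caveat with simply skipping the argument: models[i] would stay null, and the later loops in main that call models[i].probability(...) would then fail with a NullPointerException. A possible alternative is to fail early with a clear message. The sketch below assumes that is acceptable for the assignment; the class ModelNames and the method modelNameFor are hypothetical, and rejecting names that start with "." is a design choice, not something the assignment requires.

```java
// Hypothetical helper class; not part of the assignment code.
class ModelNames {
    /**
     * Returns the part of fileName before the first '.',
     * or throws if there is no '.' or nothing before it.
     */
    static String modelNameFor(String fileName) {
        int dot = fileName.indexOf('.');
        if (dot <= 0) {
            throw new IllegalArgumentException(
                "Expected <source-name>.corpus but got: \"" + fileName + "\"");
        }
        return fileName.substring(0, dot);
    }

    public static void main(String[] args) {
        // All arguments except the last are treated as corpus files, as in the question.
        for (int i = 0; i < args.length - 1; i++) {
            System.out.println(args[i] + " -> " + modelNameFor(args[i]));
        }
    }
}
```

In the question's main, the call would then read new SourceModel(ModelNames.modelNameFor(args[i]), args[i]).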