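I am trying to write a program that reads data from a file. For each word in the file, I want to check whether it matches one of the words in a particular String array.
Each time a word does not match, I want to keep count (wrong++) and, at the end, print how many words in the file did not match any word in the String array.
Here is my program: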
public class main_class {
public static int num_wrong;
public static java.io.File file = new java.io.File("text.txt");
public static String[] valid_letters = new String[130];
public static boolean wrong = true;
public static String[] sample = new String[190];
public static void text_file() throws Exception {
// Create Scanner to read file
Scanner input = new Scanner(file);
String[] valid_letters = { "I", " have ", " got ", "a", "date", "at",
"quarter", "to", "eight", "8", "7:45", "I’ll", "see", "you",
"the", "gate", ",", "so", "don’t", "be", "late", "We",
"surely", "shall", "sun", "shine", "soon", "would", "like",
"sit", "here", "cannot", "hear", "because", "of", "wood",
"band", "played", "its", "songs", "banned", "glamorous",
"night", "sketched", "a", "drone", "flying", "freaked", "out",
"when", "saw", "squirrel", "swimming", "man", "had", "cat",
" that", "was", "eating", "bug", "After", "dog", "got", "wet",
"Ed", "buy", "new", "pet", "My", "mom", "always", "tells",
"me", "beautiful", "eyes", "first", "went", "school", "wanted",
"die" };
while (input.hasNext()) {
String[] sample = input.next().split("\t");
for (int i = 0; i < valid_letters.length; i++) {
for (int j = 0; j < 1; j++) {
if (sample[j] == valid_letters[i]) {
boolean wrong = false;
System.out.print("break");
break;
}
}
}
if (wrong = true) {
num_wrong++;
}
}
// print out the results from the search
System.out
.print(" The number of wrong words in the first 13 sentences are "
+ num_wrong);
// Close the file
input.close();
}
}
For example, if the text file contains:
I want to go to school little monkey
the program should return 2 as the number of wrong words.
Answer 0 (score: 2)
Code:
import java.util.Scanner;
public class main_class {
public static int num_wrong = 0;
public static java.io.File file = new java.io.File("text.txt");
public static String[] valid_letters = new String[130];
public static boolean wrong = true;
public static String[] sample = new String[190];
public static void main (String [] args) {
try {
text_file();
} catch (Exception e) {
e.printStackTrace();
}
}
public static void text_file() throws Exception {
// Create Scanner to read file
Scanner input = new Scanner(file);
String [] valid_letters = { "I", " have ", " got ", "a", "date", "at",
"quarter", "to", "eight", "8", "7:45", "I’ll", "see", "you",
"the", "gate", ",", "so", "don’t", "be", "late", "We",
"surely", "shall", "sun", "shine", "soon", "would", "like",
"sit", "here", "cannot", "hear", "because", "of", "wood",
"band", "played", "its", "songs", "banned", "glamorous",
"night", "sketched", "a", "drone", "flying", "freaked", "out",
"when", "saw", "squirrel", "swimming", "man", "had", "cat",
" that", "was", "eating", "bug", "After", "dog", "got", "wet",
"Ed", "buy", "new", "pet", "My", "mom", "always", "tells",
"me", "beautiful", "eyes", "first", "went", "school", "wanted",
"die" };
while (input.hasNext()) {
// NOTE: split using space, i.e. " "
String[] sample = input.next().split(" ");
// NOTE: j < sample.length
for (int j = 0; j < sample.length; j++)
{
for (int i = 0; i < valid_letters.length; i++)
{
// NOTE: string comparison is using equals
if (sample[j].equals(valid_letters[i])) {
// NOTE: You want to update the variable wrong.
// And not create a local variable 'wrong' here!
wrong = false;
System.out.printf("%-12s is inside!%n",
"'" + valid_letters[i] + "'");
break;
}
}
if (wrong) {
num_wrong++;
}
// Reset wrong
wrong = true;
}
}
// Print out the results from the search
System.out.println("The number of wrong words in the first 13 sentences are "
+ num_wrong);
// Close the file
input.close();
}
}
Input (stored in "text.txt"):
I want to go to school little monkey
Output:
'I' is inside!
'to' is inside!
'to' is inside!
'school' is inside!
The number of wrong words in the first 13 sentences are 4
//'go', 'want', 'little' and 'monkey' are not inside the String array
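The most important fix in the code above is the switch from == to equals. As a minimal sketch of the difference (the class and method names here are made up for illustration and are not part of the original code):

```java
final class StringCompareDemo {
    // True when the two strings contain the same characters.
    static boolean sameText(String a, String b) {
        return a.equals(b);
    }

    // True only when both variables refer to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String fromArray = "school";
        // new String(...) creates a distinct object with identical characters,
        // comparable to a token freshly returned by Scanner.next().
        String fromFile = new String("school");

        System.out.println(sameText(fromArray, fromFile));   // prints true
        System.out.println(sameObject(fromArray, fromFile)); // prints false
    }
}
```

This is why the original check never reported a match: the words read from the file are different objects from the literals in the array, even when the text is identical.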
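Notes:
- Strings are compared by value using equals, not == (which compares references).
- boolean wrong = false; creates a new local variable instead of updating the existing wrong field.
- The loop over the words should run while j < sample.length, not j < 1.
- Split the string on " " (space), not "\t" (tab).
Answer 1 (score: 2)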
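If you want this to be fast and you expect the word list to change, you can build a ternary tree or a hash on the fly.
If the word list does not change, you can avoid splitting the words at all by turning the ternary tree into one complete regular expression. Then run a find-all to get every word that is not in the list.
A regex of this kind is a very fast way to do it.
You can use this trial application to generate the regex automatically from a word list:
regexformat.com.
Set it to case-insensitive with whitespace word boundaries.
Change the output group to a negative lookahead, as shown below.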
# "(?i)(?<!\\S)(?!(?:,|7:45|8|a(?:fter|lways|t)?|b(?:an(?:d|ned)|e(?:autiful|cause)?|u(?:g|y))|ca(?:nnot|t)|d(?:ate|ie|o(?:g|n’t)|rone)|e(?:ating|d|ight|yes)|f(?:irst|lying|reaked)|g(?:ate|lamorous|ot)|h(?:a(?:d|ve)|e(?:ar|re))|i(?:ts|’ll)?|l(?:ate|ike)|m(?:an|e|om|y)|n(?:ew|ight)|o(?:f|ut)|p(?:et|layed)|quarter|s(?:aw|chool|ee|h(?:all|ine)|it|ketched|o(?:ngs|on)?|quirrel|u(?:n|rely)|wimming)|t(?:ells|h(?:at|e)|o)|w(?:a(?:nted|s)|e(?:nt|t)?|hen|o(?:od|uld))|you)(?!\\S))\\S+(?!\\S)"
(?i)
(?<! \S )
(?!
(?:
,
| 7:45
| 8
| a
(?: fter | lways | t )?
| b
(?:
an
(?: d | ned )
| e
(?: autiful | cause )?
| u
(?: g | y )
)
| ca
(?: nnot | t )
| d
(?:
ate
| ie
| o
(?: g | n’t )
| rone
)
| e
(?: ating | d | ight | yes )
| f
(?: irst | lying | reaked )
| g
(?: ate | lamorous | ot )
| h
(?:
a
(?: d | ve )
| e
(?: ar | re )
)
| i
(?: ts | ’ll )?
| l
(?: ate | ike )
| m
(?: an | e | om | y )
| n
(?: ew | ight )
| o
(?: f | ut )
| p
(?: et | layed )
| quarter
| s
(?:
aw
| chool
| ee
| h
(?: all | ine )
| it
| ketched
| o
(?: ngs | on )?
| quirrel
| u
(?: n | rely )
| wimming
)
| t
(?:
ells
| h
(?: at | e )
| o
)
| w
(?:
a
(?: nted | s )
| e
(?: nt | t )?
| hen
| o
(?: od | uld )
)
| you
)
(?! \S )
)
\S+
(?! \S )
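To try the same idea from Java, here is a rough sketch that builds a pattern of the same shape at runtime. It is an assumption-laden simplification: it uses a flat alternation instead of the prefix-factored tree above, and the class and method names (RegexWordFilter, buildPattern, findUnknown) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexWordFilter {
    // Builds (?i)(?<!\S)(?!(?:w1|w2|...)(?!\S))\S+(?!\S) from the word list.
    // Each word is quoted so characters such as ':' or ',' are taken literally.
    static Pattern buildPattern(List<String> validWords) {
        List<String> quoted = new ArrayList<>();
        for (String word : validWords) {
            quoted.add(Pattern.quote(word));
        }
        String alternation = String.join("|", quoted);
        return Pattern.compile(
                "(?i)(?<!\\S)(?!(?:" + alternation + ")(?!\\S))\\S+(?!\\S)");
    }

    // Collects every whitespace-delimited token that is not in the word list.
    static List<String> findUnknown(Pattern pattern, String text) {
        List<String> unknown = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            unknown.add(matcher.group());
        }
        return unknown;
    }

    public static void main(String[] args) {
        Pattern pattern = buildPattern(Arrays.asList("I", "to", "school"));
        // prints [want, go, little, monkey]
        System.out.println(findUnknown(pattern, "I want to go to school little monkey"));
    }
}
```

Because of (?i), a lowercase "i" would also count as a valid word here; drop the flag if the match should be case-sensitive.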
Answer 2 (score: 0)
I can see two mistakes.
if(wrong = true)
I think you meant
if(wrong == true)
Also, this loop:
for (int j= 0 ; j < 1 ; j ++ )
will only execute for j = 0, because it stops right after that. I think you meant
for (int j= 0 ; j < sample.length ; j ++ )
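To see why the single = matters, here is a small sketch (the class and method names are hypothetical). An assignment is itself an expression whose value is the assigned value, so the condition is always true no matter what wrong held before:

```java
final class AssignmentVsComparison {
    // Mirrors the buggy condition: '=' assigns true and then evaluates to true.
    static boolean buggyCheck(boolean wrong) {
        return (wrong = true);
    }

    // The intended condition; for a boolean, 'wrong' alone is the idiomatic
    // form of 'wrong == true'.
    static boolean fixedCheck(boolean wrong) {
        return wrong;
    }

    public static void main(String[] args) {
        System.out.println(buggyCheck(false)); // prints true
        System.out.println(fixedCheck(false)); // prints false
    }
}
```

Writing if (wrong) instead of if (wrong == true) avoids this class of typo entirely.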
Answer 3 (score: 0)
Use a collection such as List, so that each time you can simply call list.contains("word")
instead of looping over the String array.
List<String> validWords = new ArrayList<String>();
validWords.add("I");
validWords.add("More words ..");
int wrongCount = 0;
while (input.hasNext())
{
String [] sample = input.next().split("\t");
for ( int i = 0 ; i < sample.length ; i++)
{
if (!validWords.contains(sample[i]))
{
wrongCount ++ ;
}
}
}
System.out.print(" The number of wrong words" + wrongCount ) ;
// .....
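Note: redeclaring the class-level variables (valid_letters, sample) inside the method is a mistake. You can initialize them where they are declared in main_class, without having to specify a fixed array size.
Second, your code breaks out of the loop as soon as it finds the first matching word, so the remaining words are not checked. Is that what you want? If so, you can edit my code accordingly.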
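For completeness, here is a self-contained sketch of the collection-based approach. It swaps the List for a HashSet (my substitution, for constant-time lookups) and reads from a String instead of text.txt so that it runs on its own. The class name WrongWordCounter and the three-word list are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

final class WrongWordCounter {
    // Counts the whitespace-separated tokens that are not in validWords.
    // Scanner.next() already splits on whitespace, so no extra split is needed.
    static int countWrong(Scanner input, Set<String> validWords) {
        int wrongCount = 0;
        while (input.hasNext()) {
            if (!validWords.contains(input.next())) {
                wrongCount++;
            }
        }
        return wrongCount;
    }

    public static void main(String[] args) {
        Set<String> validWords = new HashSet<>(Arrays.asList("I", "to", "school"));
        // To read the real file instead, use: new Scanner(new java.io.File("text.txt"))
        try (Scanner input = new Scanner("I want to go to school little monkey")) {
            // prints: The number of wrong words is 4
            System.out.println("The number of wrong words is "
                    + countWrong(input, validWords));
        }
    }
}
```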