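I have to check whether the words in File1 exist in File2, and then count the occurrences. The data in the two files looks like this.
The words in File1 look like this:
The data in File2 looks like this:
The code I wrote is as follows: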
File file1 = new File("ChineseWord.txt");
Scanner sc = new Scanner(new FileInputStream(file1));
ArrayList<String> list = new ArrayList<String>();
ArrayList<String> newList = new ArrayList<String>();
while(sc.hasNext()){
list.add(sc.next());
}
sc.close();
File file2 = new File("RandomData.txt");
Scanner newScanner = new Scanner(new FileInputStream(file2));
int count = 0;
for (int i = 0; i < list.size(); i++) {
while(newScanner.hasNext()){
String word = newScanner.nextLine();
String toMatch = list.get(i);
if(word.contains(toMatch)){
System.out.println("Success");
count++;
}
}
String test = list.get(i);
newList.add(test+"exists" + count+ "times");
count =0;
}
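The problem is that it returns 0 for every word, even though the first word in File1 appears in the first line of File2. If I check manually, like this: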
if(word.contains("发表")){
System.out.println("Success");
count++;
}
it prints Success, but otherwise it does not. Why is that?
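Answer 0 (score: 2)
The problem is in the scope of your logic: you loop over every word in list, but the scanner on "File2" is created only once, outside that loop. The first iteration consumes the whole file, so every later word finds nothing left to read.
You should probably move the list loop inside the scanner loop, so that it wraps if (word.contains(toMatch)).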
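To illustrate the point about the scanner, here is a minimal sketch that runs both loop orders over the same in-memory text. The class name, the method names, and the two sample lines are made up for the demonstration; a Scanner over a String stands in for the file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public class ScannerExhaustionDemo {
    // Word loop outside, ONE shared scanner: the first word drains the
    // scanner, so every later word sees no lines at all.
    static int[] countWithSharedScanner(List<String> words, String data) {
        int[] counts = new int[words.size()];
        try (Scanner scanner = new Scanner(data)) {
            for (int i = 0; i < words.size(); i++) {
                while (scanner.hasNextLine()) {
                    if (scanner.nextLine().contains(words.get(i))) {
                        counts[i]++;
                    }
                }
            }
        }
        return counts;
    }

    // Line loop outside: each line is read once and checked against every word.
    static int[] countWithLineOuterLoop(List<String> words, String data) {
        int[] counts = new int[words.size()];
        try (Scanner scanner = new Scanner(data)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                for (int i = 0; i < words.size(); i++) {
                    if (line.contains(words.get(i))) {
                        counts[i]++;
                    }
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("发表", "发愁");   // hypothetical word list
        String data = "他发表了文章\n不要发愁\n";              // hypothetical file content
        System.out.println(Arrays.toString(countWithSharedScanner(words, data))); // [1, 0]
        System.out.println(Arrays.toString(countWithLineOuterLoop(words, data))); // [1, 1]
    }
}
```

With the shared scanner the second word is never matched, even though it appears in the second line.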
Following your comment, I ran a quick test:
package so36862093;
import com.google.common.io.Resources;
import java.io.File;
import java.io.FileInputStream;
import java.nio.file.Files;
import java.util.*;
public class App {
public static void main(final String[] args) throws Exception {
final File file1 = new File(Resources.getResource("so36862093/ChineseWord.txt").toURI());
final List<String> list = Files.readAllLines(file1.toPath());
final File file2 = new File(Resources.getResource("so36862093/RandomData.txt").toURI());
final Scanner newScanner = new Scanner(new FileInputStream(file2));
final Map<String, Integer> count = new HashMap<>();
while(newScanner.hasNext()){
final String word = newScanner.nextLine();
for (String toMatch : list) {
if(word.contains(toMatch)){
System.out.println("Success");
count.put(toMatch, count.getOrDefault(toMatch, 0) + 1);
}
}
}
for (Map.Entry<String, Integer> e : count.entrySet()) {
System.out.println(e.getKey() + " exists " + e.getValue() + " times.");
}
}
}
With ChineseWord.txt (UTF-8):
发表
发愁
发达
发抖
发挥
And RandomData.txt (UTF-8):
Output:
Success
发表 exists 1 times.
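Follow-up: I played a bit with the project you shared, and the problem is that each line of your file starts with a zero-width no-break space, U+FEFF (decimal 65279, also used as the byte order mark), which my file does not have.
So you should probably strip that character before matching.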
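One possible way to strip it is sketched below. The class and method names are hypothetical, and the sketch assumes the character only appears at the start of a line. Note that String.trim() will not help here, because it only removes characters up to U+0020.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BomStripper {
    // U+FEFF (decimal 65279): zero-width no-break space / byte order mark.
    private static final char BOM = '\uFEFF';

    // Returns the string without a single leading U+FEFF, if one is present.
    static String stripBom(String s) {
        return (!s.isEmpty() && s.charAt(0) == BOM) ? s.substring(1) : s;
    }

    // Applies stripBom to every line and returns a new list.
    static List<String> stripBomFromAll(List<String> lines) {
        List<String> cleaned = new ArrayList<>(lines.size());
        for (String line : lines) {
            cleaned.add(stripBom(line));
        }
        return cleaned;
    }

    public static void main(String[] args) {
        String raw = "\uFEFF发表";              // a word as read from the affected file
        String line = "他发表了文章";            // hypothetical line from the data file
        System.out.println(line.contains(raw));            // false: invisible char breaks the match
        System.out.println(line.contains(stripBom(raw)));  // true
        System.out.println(stripBomFromAll(Arrays.asList(raw, "发愁")).get(0).length()); // 2
    }
}
```

In the test program above, the cleaning step would go right after the word list is read, before the matching loop.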
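Answer 1 (score: 2)
Right now you read the entire file and compare it only with the first element of the list. It should be the other way round: read one line from file2 and compare it with the whole list, then move on to the next line.
Change your code to: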
while(newScanner.hasNext()){
String word = newScanner.nextLine();
for (int i = 0; i < list.size(); i++) {
String toMatch = list.get(i);
if(word.contains(toMatch)){
System.out.println("Success");
count++;
}
}
}
Answer 2 (score: 0)
I think your problem lies in the encoding:
Scanner newScanner = new Scanner(new FileInputStream(file2),"UNICODE");
Try:
File file1 = new File("data/ChineseWord.txt");
Scanner sc = new Scanner(new FileInputStream(file1),"UNICODE");
ArrayList<String> list = new ArrayList<String>();
ArrayList<String> newList = new ArrayList<String>();
while(sc.hasNext()){
list.add(sc.next());
}
sc.close();
File file2 = new File("data/RandomData.txt");
Scanner newScanner = new Scanner(new FileInputStream(file2),"UNICODE");
int count = 0;
for (int i = 0; i < list.size(); i++) {
while(newScanner.hasNext()){
String word = newScanner.nextLine();
String toMatch = list.get(i);
if(word.contains(toMatch)){
System.out.println("Success");
count++;
}
}
String test = list.get(i);
newList.add(test+"exists" + count+ "times");
count =0;
}
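If the files are actually saved as UTF-8 (an assumption, based on the other answer describing them that way), the charset name has to match: as far as I know, "UNICODE" is treated by Java as an alias for UTF-16, which would misread UTF-8 bytes. A minimal sketch of reading with an explicit UTF-8 charset follows; the class name, method name, and temporary file are made up for the demonstration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public class ExplicitCharsetDemo {
    // Reads all lines of a file, decoding the bytes explicitly as UTF-8
    // instead of relying on the platform default charset.
    static List<String> readLinesUtf8(Path path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(Files.newInputStream(path), "UTF-8")) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-8 file so the example is self-contained.
        Path tmp = Files.createTempFile("words", ".txt");
        try {
            Files.write(tmp, "发表\n发愁\n".getBytes(StandardCharsets.UTF_8));
            List<String> lines = readLinesUtf8(tmp);
            System.out.println(lines.equals(Arrays.asList("发表", "发愁"))); // true
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Whichever charset is used, it should be the same one the files were saved with, and it should be passed to both scanners.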