Question

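I have to check whether the words in File1 occur in File2 and then count the occurrences. The data in the two files looks like this.

The words in File1 look like this: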

发表
发愁
发达
发抖
发挥

The data in File2 looks like this:

这篇论文是什么时候发表的？
91。数据删掉被马工程师了
92。驾驶酒后很大危害
93。客观地要他人评价
94。我不小心水壶打翻了把

The code I wrote is as follows:

File file1 = new File("ChineseWord.txt");
Scanner sc = new Scanner(new FileInputStream(file1));
ArrayList<String> list = new ArrayList<String>();
ArrayList<String> newList = new ArrayList<String>();

while(sc.hasNext()){
    list.add(sc.next());
}

sc.close();

File file2 = new File("RandomData.txt");

Scanner newScanner = new Scanner(new FileInputStream(file2));

int count = 0;

for (int i = 0; i < list.size(); i++) {

    while(newScanner.hasNext()){

        String word = newScanner.nextLine();
        String toMatch = list.get(i);

        if(word.contains(toMatch)){
            System.out.println("Success");
            count++;
        }

    }

    String test = list.get(i);
    newList.add(test+"exists" + count+ "times");
    count =0;

}

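The problem is that it returns 0 for every word, even though the first word in File1 occurs in the first line of File2. If I manually do something like this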

if(word.contains("发表")){
    System.out.println("Success");
    count++;
}

it prints Success, but otherwise it does not. Why is that?

Answer 1

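The problem is in your logic: you loop over every word in list, but the Scanner on File2 is created only once, outside that for loop. The inner while loop consumes the whole file for the first word, so nothing is left to read for the remaining words.

You should probably move the loop over list inside the while loop, around if (word.contains(toMatch)).

Following your comments, I ran a quick test: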

package so36862093;

import com.google.common.io.Resources;

import java.io.File;
import java.io.FileInputStream;
import java.nio.file.Files;
import java.util.*;

public class App {
    public static void main(final String[] args) throws Exception {
        final File file1 = new File(Resources.getResource("so36862093/ChineseWord.txt").toURI());
        final List<String> list = Files.readAllLines(file1.toPath());
        final File file2 = new File(Resources.getResource("so36862093/RandomData.txt").toURI());
        final Scanner newScanner = new Scanner(new FileInputStream(file2));
        final Map<String, Integer> count = new HashMap<>();

        while(newScanner.hasNext()){
            final String word = newScanner.nextLine();

            for (String toMatch : list) {
                if(word.contains(toMatch)){
                    System.out.println("Success");
                    count.put(toMatch, count.getOrDefault(toMatch, 0) + 1);
                }
            }
        }

        for (Map.Entry<String, Integer> e : count.entrySet()) {
            System.out.println(e.getKey() + " exists " + e.getValue() + " times.");
        }
    }
}

with ChineseWord.txt (UTF-8):

发表
发愁
发达
发抖
发挥

and RandomData.txt (UTF-8):

Output:

Success
发表 exists 1 times.

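Follow-up: I played a bit with the project you shared. The problem is that you have a zero-width no-break space, U+FEFF (decimal 65279, also used as the byte order mark), at the beginning of each line (I did not).

So you should probably strip that character first.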
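
One way to strip that character is sketched below. The class and method names (BomStripper, stripBom, cleanWords) are hypothetical, and the sketch assumes the stray character is U+FEFF at the start of a line, as described above.

```java
import java.util.ArrayList;
import java.util.List;

final class BomStripper {
    // U+FEFF (decimal 65279): zero-width no-break space, also used as the byte order mark.
    private static final char BOM = '\uFEFF';

    /** Removes any leading U+FEFF characters from a single line. */
    static String stripBom(String line) {
        int start = 0;
        while (start < line.length() && line.charAt(start) == BOM) {
            start++;
        }
        return line.substring(start);
    }

    /** Strips the BOM from every line and drops lines that end up empty. */
    static List<String> cleanWords(List<String> lines) {
        List<String> cleaned = new ArrayList<>();
        for (String line : lines) {
            String word = stripBom(line).trim();
            if (!word.isEmpty()) {
                cleaned.add(word);
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        String sentence = "这篇论文是什么时候发表的？";
        String dirty = "\uFEFF发表"; // looks identical to "发表" when printed
        System.out.println(sentence.contains(dirty));           // false
        System.out.println(sentence.contains(stripBom(dirty))); // true
    }
}
```

Note that String.trim() alone is not enough here, because it only removes characters up to U+0020.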

Answer 2

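Right now you read the entire file and compare it against only the first element of the list. It should be the other way around: read one line from file2 and compare it against the whole list.

Change your code to: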

while(newScanner.hasNext()){
    String word = newScanner.nextLine();
    for (int i = 0; i < list.size(); i++) {
        String toMatch = list.get(i);

        if(word.contains(toMatch)){
            System.out.println("Success");
            count++;
        }
    }
}
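
The reason the original loop order only ever tests the first word is that a Scanner can be walked once: after the inner while loop finishes, hasNext() stays false. A minimal sketch of that behavior, using an in-memory string in place of RandomData.txt (the class and method names are hypothetical):

```java
import java.util.Scanner;

final class ScannerExhaustionDemo {
    /** Counts the lines still available from the scanner, consuming them. */
    static int countRemainingLines(Scanner scanner) {
        int lines = 0;
        while (scanner.hasNextLine()) {
            scanner.nextLine();
            lines++;
        }
        return lines;
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner("line one\nline two\nline three");
        System.out.println(countRemainingLines(scanner)); // 3: the first pass reads everything
        System.out.println(countRemainingLines(scanner)); // 0: nothing is left for a second pass
        scanner.close();
    }
}
```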

Answer 3

I think your problem is the encoding:

 Scanner newScanner = new Scanner(new FileInputStream(file2),"UNICODE");

Try:

    File file1 = new File("data/ChineseWord.txt");
    Scanner sc = new Scanner(new FileInputStream(file1),"UNICODE");
    ArrayList<String> list = new ArrayList<String>();
    ArrayList<String> newList = new ArrayList<String>();

    while(sc.hasNext()){
            list.add(sc.next());
    }

    sc.close();

    File file2 = new File("data/RandomData.txt");
    Scanner newScanner = new Scanner(new FileInputStream(file2),"UNICODE");

    int count = 0;

    for (int i = 0; i < list.size(); i++) {

        while(newScanner.hasNext()){

            String word = newScanner.nextLine();
            String toMatch = list.get(i);

            if(word.contains(toMatch)){
                System.out.println("Success");
                count++;
            }


        }

        String test = list.get(i);
        newList.add(test+"exists" + count+ "times");
        count =0;

    }
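
If the files are actually saved as UTF-8 (an assumption; the question does not state their encoding), the charset can be passed explicitly rather than left to the platform default. A rough sketch, in which the class name Utf8WordCounter and the helper countLinesContaining are hypothetical:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class Utf8WordCounter {
    /** For each non-empty word, counts how many lines contain it. */
    static Map<String, Integer> countLinesContaining(List<String> words, List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // an empty string is contained in every line
            }
            int count = 0;
            for (String line : lines) {
                if (line.contains(word)) {
                    count++;
                }
            }
            counts.put(word, count);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: both files are UTF-8; adjust the charset if they are not.
        List<String> words = Files.readAllLines(Paths.get("ChineseWord.txt"), StandardCharsets.UTF_8);
        List<String> lines = Files.readAllLines(Paths.get("RandomData.txt"), StandardCharsets.UTF_8);
        for (Map.Entry<String, Integer> e : countLinesContaining(words, lines).entrySet()) {
            System.out.println(e.getKey() + " exists " + e.getValue() + " times.");
        }
    }
}
```

Because every line is held in memory, each word is compared against the whole file, so the loop order no longer matters. Decoding as UTF-8 does not remove a leading U+FEFF, so that character would still have to be stripped separately.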

String.contains function not working
