Question

I am trying to remove from file 1 every line that does not exist in file 2.

Example:

Input

File 1

text
example
word

File 2

example
word

Output

File 1

example
word

My code does the exact opposite: it removes every word that appears in both files.

My actual output is:

File 1

text

Code

BufferedReader reader2 = new BufferedReader(new FileReader(file2));
Set<String> lines2 = new HashSet<String>(10000);
String line2;
while ((line2 = reader2.readLine()) != null) {
    lines2.add(line2);
}
BufferedReader reader = new BufferedReader(new FileReader(file1));
Set<String> lines = new HashSet<String>(10000);
String line;
while ((line = reader.readLine()) != null) {
    lines.add(line);
}
Set<String> set3 = new HashSet<String>(lines);
set3.removeAll(lines2);

Answer 1

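You need the intersection of the two sets. Right now you are computing the set difference (the lines of file 1 that are not in file 2), which is why only "text" is left. Use retainAll instead of removeAll: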

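import java.util.HashSet;
import java.util.Set;

public class Intersection {

    public static void main(String[] args) {
        // Stand-ins for the lines read from file 1 and file 2.
        Set<String> file1 = new HashSet<>();
        Set<String> file2 = new HashSet<>();

        file1.add("text");
        file1.add("example");
        file1.add("word");

        file2.add("example");
        file2.add("word");

        // Copy file1 so the original set stays intact, then keep only
        // the elements that are also in file2.
        Set<String> intersection = new HashSet<>(file1);
        intersection.retainAll(file2);

        System.out.println(intersection);
    }
}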

Output:

[word, example]
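To connect this back to the files in the question, here is a minimal sketch that applies the same retainAll idea to lines read from disk. The class name KeepCommonLines, the helper keepCommon and the default paths file1.txt and file2.txt are assumptions made for illustration, not something from the question. A LinkedHashSet is used so that the surviving lines keep the order they had in file 1; duplicate lines are still collapsed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class KeepCommonLines {

    // Hypothetical helper: returns the lines of `first` that also occur in
    // `second`, in the order they appear in `first` (duplicates collapsed).
    static Set<String> keepCommon(List<String> first, List<String> second) {
        Set<String> result = new LinkedHashSet<>(first);
        result.retainAll(new HashSet<>(second));
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default paths; pass real ones as command-line arguments.
        Path file1 = Paths.get(args.length > 0 ? args[0] : "file1.txt");
        Path file2 = Paths.get(args.length > 1 ? args[1] : "file2.txt");

        List<String> lines1 = Files.readAllLines(file1, StandardCharsets.UTF_8);
        List<String> lines2 = Files.readAllLines(file2, StandardCharsets.UTF_8);

        // Overwrite file 1 with only the lines that also exist in file 2.
        Files.write(file1, keepCommon(lines1, lines2), StandardCharsets.UTF_8);
    }
}
```

Note that this overwrites file 1 in place, so try it on a copy first.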

Answer 2

Well, your approach is almost there. All you are missing is one more line of code after the ones you already have:

lines.removeAll(set3);

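After that, the set lines holds the result you want. At that point set3 contains only the lines that are unique to file 1, so removing them from lines leaves exactly the lines that also exist in file 2.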
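As a quick sanity check, the sketch below compares this two-step removeAll approach with a single retainAll call on the sample data from the question. The class and method names are made up for the example; both helpers work on copies, so the input sets are left untouched.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class DifferenceVsIntersection {

    // Two-step approach: compute the difference, then remove it from a copy of lines.
    static Set<String> viaRemoveAll(Set<String> lines, Set<String> lines2) {
        Set<String> set3 = new HashSet<>(lines);
        set3.removeAll(lines2);            // lines that exist only in file 1
        Set<String> result = new HashSet<>(lines);
        result.removeAll(set3);            // drop them, keeping the common lines
        return result;
    }

    // One-step approach: keep only the elements that are also in lines2.
    static Set<String> viaRetainAll(Set<String> lines, Set<String> lines2) {
        Set<String> result = new HashSet<>(lines);
        result.retainAll(lines2);
        return result;
    }

    public static void main(String[] args) {
        Set<String> lines = new HashSet<>(Arrays.asList("text", "example", "word"));
        Set<String> lines2 = new HashSet<>(Arrays.asList("example", "word"));

        // Both approaches yield the same set, so this prints "true".
        System.out.println(viaRemoveAll(lines, lines2).equals(viaRetainAll(lines, lines2)));
    }
}
```

Either form works; retainAll simply does it without the intermediate set3.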

Answer 3

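In your original code you read file 2 first and then file 1, and then you only removed the words of file 2 from file 1, which leaves the one word that differs. Below I have written out the code with comments. You need that set of differing words, and then you remove it from the full set. In my code I create a new set for the result, in case you want to rebuild the first set or keep it unmodified.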

package scrapCompare;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class CompareLines {

public static void main(String[] args) throws IOException {
    //You create a set of words from file 1.
    BufferedReader reader = new BufferedReader(new FileReader("file1"));
    Set<String> lines = new HashSet<String>(10000);
    String line;
    while ((line = reader.readLine()) != null) {
        lines.add(line);
    }
    //You create a set of words from file 2.
    BufferedReader reader2 = new BufferedReader(new FileReader("file2"));
    Set<String> lines2 = new HashSet<String>(10000);
    String line2;
    while ((line2 = reader2.readLine()) != null) {
        lines2.add(line2);
    }

    // Your original code stops after this step: set3 is a copy of file 1's lines
    // with every line of file 2 removed, so it holds only the lines unique to file 1.
    Set<String> set3 = new HashSet<String>(lines);
    set3.removeAll(lines2);

    // Build the answer in a new set so that lines itself stays unmodified.
    Set<String> answer = new HashSet<String>(lines);
    answer.removeAll(set3);
    // Print the answer to the console.
    Iterator<String> itr = answer.iterator();
    while (itr.hasNext()) {
        System.out.println(itr.next());
    }

    //close your readers
    reader.close();
    reader2.close();

}

}
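One caveat about the code above: the readers are closed only if nothing throws before the close() calls are reached. A minimal sketch of the same reading step with try-with-resources follows; the helper readLines and the class name are hypothetical, the file names file1 and file2 are the ones used above, and UTF-8 is an assumed encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

class ReadLinesSafely {

    // Hypothetical helper: reads every line of a file into a set.
    // The reader is closed automatically, even if readLine() throws.
    static Set<String> readLines(Path path) throws IOException {
        Set<String> lines = new HashSet<>();
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Set<String> lines = readLines(Paths.get("file1"));
        // Keep only the lines that also exist in file 2.
        lines.retainAll(readLines(Paths.get("file2")));
        for (String line : lines) {
            System.out.println(line);
        }
    }
}
```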

Answer 4

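import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class RemoveLine {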

    public static void main(String[] args) throws IOException {
        String file = "../file.txt";
        String file1 = "../file1.txt";
        String file2 = "../file2.txt";

        BufferedReader reader2 = new BufferedReader(new FileReader(file2));
        Set<String> lines2 = new HashSet<String>(10000);
        String line2;
        while ((line2 = reader2.readLine()) != null) {
            lines2.add(line2);
        }

        BufferedReader reader1 = new BufferedReader(new FileReader(file1));
        Set<String> lines1 = new HashSet<String>(10000);
        String line1;
        while ((line1 = reader1.readLine()) != null) {
            lines1.add(line1);
        }

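        // Keep only the lines of file 1 that also occur in file 2.
        Set<String> outPut = lines1.stream().filter(lines2::contains).collect(Collectors.toSet());

        // Close the readers now that both files have been read.
        reader1.close();
        reader2.close();

        Charset utf8 = StandardCharsets.UTF_8;

        // TRUNCATE_EXISTING avoids leaving stale content behind if the output file already exists.
        Files.write(Paths.get(file), outPut, utf8, StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING);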

    }

}

How do I compare 2 files and remove the lines that do not exist?