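How do I compare two files and remove the lines that don't exist in the second one?

Time: 2019-03-29 18:05:09

Tags: java file compare hashset

I'm trying to remove from file 1 every line that does not exist in file 2.

Example:

Input

File 1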

text
example
word

File 2

example
word

Output

File 1

example
word

My code does the exact opposite: it removes every word that appears in both files.

My actual output is:

File 1

text

Code

// Read every line of file 2 into a set.
BufferedReader reader2 = new BufferedReader(new FileReader(file2));
Set<String> lines2 = new HashSet<String>(10000);
String line2;
while ((line2 = reader2.readLine()) != null) {
    lines2.add(line2);
}
// Read every line of file 1 into a set.
BufferedReader reader = new BufferedReader(new FileReader(file1));
Set<String> lines = new HashSet<String>(10000);
String line;
while ((line = reader.readLine()) != null) {
    lines.add(line);
}
// This is the step that gives the unwanted result: it keeps only the lines missing from file 2.
Set<String> set3 = new HashSet<String>(lines);
set3.removeAll(lines2);

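4 Answers:

Answer 0: (score: 0)

You need the intersection of the two sets. Right now you are computing the set difference (the lines of file 1 that are not in file 2).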

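import java.util.HashSet;
import java.util.Set;

// The class name is illustrative; the original snippet only showed main.
public class Intersection {

    public static void main(String[] args) {
        // Sample data standing in for the lines of each file.
        Set<String> file1 = new HashSet<>();
        Set<String> file2 = new HashSet<>();

        file1.add("text");
        file1.add("example");
        file1.add("word");

        file2.add("example");
        file2.add("word");

        // Copy file1 so it stays untouched, then keep only the elements that are also in file2.
        Set<String> intersection = new HashSet<>(file1);
        intersection.retainAll(file2);

        System.out.println(intersection);
    }
}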

Output:

[word, example]

Answer 1: (score: 0)

Well, your approach almost works. All you are missing is one more line of code after the call to removeAll:

lines.removeAll(set3);

After that, the set (lines) holds the result you want.
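
To see why that one extra line is enough, here is a minimal sketch using the sample data from the question. The class `SetDemo` and its two helper methods are hypothetical names introduced only for illustration. It removes the difference from a copy of the first set and compares the result with what `retainAll` produces directly.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; not part of any answer above.
public class SetDemo {

    // Two-step approach: compute (a minus b), then remove that from a copy of a.
    static Set<String> keepCommonViaRemoveAll(Set<String> a, Set<String> b) {
        Set<String> onlyInA = new HashSet<>(a);
        onlyInA.removeAll(b);            // elements of a that are missing from b
        Set<String> result = new HashSet<>(a);
        result.removeAll(onlyInA);       // what remains is the intersection of a and b
        return result;
    }

    // One-step equivalent using retainAll.
    static Set<String> keepCommonViaRetainAll(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> file1 = new HashSet<>(Arrays.asList("text", "example", "word"));
        Set<String> file2 = new HashSet<>(Arrays.asList("example", "word"));

        // Both approaches should yield the same set.
        System.out.println(
                keepCommonViaRemoveAll(file1, file2).equals(keepCommonViaRetainAll(file1, file2)));
    }
}
```

Both helpers work on copies, so the input sets are left unchanged. If the original set can be modified in place, a single `lines.retainAll(lines2)` has the same effect as the two `removeAll` calls.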

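Answer 2: (score: 0)

In your original code you read file 2 first and then file 1, but you only removed file 2's words from file 1's, which leaves the one word that differs. I've written out the code below with comments. You need that set of differing words, and then you remove it from the full list. In my code I create a new set, in case you want to rebuild the first set and keep it unmodified.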

package scrapCompare;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class CompareLines {

public static void main(String[] args) throws IOException {

    //You create a set of words from file 1.
    BufferedReader reader = new BufferedReader(new FileReader("file1"));
    Set<String> lines = new HashSet<String>(10000);
    String line;
    while ((line = reader.readLine()) != null) {
        lines.add(line);
    }
    //You create a set of words from file 2.
    BufferedReader reader2 = new BufferedReader(new FileReader("file2"));
    Set<String> lines2 = new HashSet<String>(10000);
    String line2;
    while ((line2 = reader2.readLine()) != null) {
        lines2.add(line2);
    }

    //In your original code, you create a third set equal to file 1's lines, and then delete every line that is in file 2.
    //That isolates the one differing word, but you stopped there.
    Set<String> set3 = new HashSet<String>(lines);
    set3.removeAll(lines2);

    //Build the answer as a separate copy, so the original lines set stays unmodified.
    Set<String> answer = new HashSet<String>(lines);
    answer.removeAll(set3);
    //iterator for printing to console.
    Iterator<String> itr = answer.iterator();
    //print the answer to console
    while (itr.hasNext()) {
        System.out.println(itr.next());
    }

    //close your readers
    reader.close();
    reader2.close();

}

}

Answer 3: (score: 0)

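import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class RemoveLine {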

    public static void main(String[] args) throws IOException {
        String file = "../file.txt";
        String file1 = "../file1.txt";
        String file2 = "../file2.txt";

        BufferedReader reader2 = new BufferedReader(new FileReader(file2));
        Set<String> lines2 = new HashSet<String>(10000);
        String line2;
        while ((line2 = reader2.readLine()) != null) {
            lines2.add(line2);
        }

        BufferedReader reader1 = new BufferedReader(new FileReader(file1));
        Set<String> lines1 = new HashSet<String>(10000);
        String line1;
        while ((line1 = reader1.readLine()) != null) {
            lines1.add(line1);
        }

        Set<String> outPut = lines1.stream().filter(l1 -> lines2.stream().anyMatch(l2 -> l2.equals(l1))).collect(Collectors.toSet());


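        Charset utf8 = StandardCharsets.UTF_8;

        // TRUNCATE_EXISTING ensures stale content is not left behind if the output file already exists.
        Files.write(Paths.get(file), outPut, utf8, StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING);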

    }

}
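
The answers above collect lines into a HashSet, which drops duplicate lines and does not keep the order of file 1. If file 1 should be rewritten with its surviving lines in their original order, one possible sketch is shown below. The class `KeepCommonLines`, the helper `keepCommon`, and the default file names are assumptions made for illustration. Only file 2 goes into a set, and file 1 is filtered as a list.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical class; a sketch, not a definitive implementation.
public class KeepCommonLines {

    // Returns the lines of lines1 that also occur in lines2, in their original order.
    static List<String> keepCommon(List<String> lines1, List<String> lines2) {
        Set<String> allowed = new HashSet<>(lines2);   // fast membership checks
        return lines1.stream()
                     .filter(allowed::contains)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // Placeholder paths; pass real ones as command-line arguments.
        Path file1 = Paths.get(args.length > 0 ? args[0] : "file1.txt");
        Path file2 = Paths.get(args.length > 1 ? args[1] : "file2.txt");

        List<String> kept = keepCommon(
                Files.readAllLines(file1, StandardCharsets.UTF_8),
                Files.readAllLines(file2, StandardCharsets.UTF_8));

        // With no open options, Files.write creates the file or truncates an existing one.
        Files.write(file1, kept, StandardCharsets.UTF_8);
    }
}
```

This reads both files fully into memory, which is fine for small inputs. Note that it overwrites file 1, so try it on a copy first.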