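How do I delete duplicate files that have the same name but different extensions?

Asked: 2018-07-04 09:51:39

Tags: java duplicates

My directory contains a large number of images. The problem is that some of them have duplicates with the same name but a different extension, e.g. image1.jpg, image1.jpeg, image1.png: all the same image, same name, different extension. How can I find and delete these duplicates with Java? There are many tools for finding duplicates, but I could not find any tool or script for this specific problem. Any help would be appreciated.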

3 answers:

Answer 0 (score: 1)

Read all the files into some kind of List:

List<File> filesInFolder = Files.walk(Paths.get("\\path\\to\\folder"))
        .filter(Files::isRegularFile)
        .map(Path::toFile)
        .collect(Collectors.toList());

Then loop over them and delete every file that does not end with the extension you want to keep:

filesInFolder.stream().filter((file) -> (!file.toString().endsWith(".jpg"))).forEach((file) -> {
    file.delete();
});

You can adapt this to your specific needs.
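Note that the filter above removes every file that does not end in .jpg, whether or not a .jpg copy of it exists. A minimal sketch of a more conservative variant, which deletes a file only when a sibling with the same base name and the preferred extension is present, might look like this. The class name `KeepPreferred`, the method `deleteWhenPreferredExists` and its `preferredExt` parameter are hypothetical names chosen for illustration, and the sketch assumes a flat (non-recursive) folder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the original answer.
final class KeepPreferred {

    // Deletes the regular files directly inside dir whose base name also exists
    // with preferredExt (for example ".jpg"). Returns the paths that were deleted.
    static List<Path> deleteWhenPreferredExists(Path dir, String preferredExt) {
        List<Path> files;
        try (Stream<Path> stream = Files.list(dir)) {
            files = stream.filter(Files::isRegularFile).collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        List<Path> deleted = new ArrayList<>();
        for (Path file : files) {
            String name = file.getFileName().toString();
            int dot = name.lastIndexOf('.');
            if (dot <= 0) {
                continue; // no extension (or a dotfile): leave it alone
            }
            if (name.substring(dot).equalsIgnoreCase(preferredExt)) {
                continue; // this is the copy we want to keep
            }
            Path preferred = file.resolveSibling(name.substring(0, dot) + preferredExt);
            if (Files.isRegularFile(preferred)) {
                try {
                    Files.delete(file);
                    deleted.add(file);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: KeepPreferred <folder>");
            return;
        }
        deleteWhenPreferredExists(Paths.get(args[0]), ".jpg")
                .forEach(p -> System.out.println("deleted " + p));
    }
}
```

With this variant a file such as image2.png, which has no .jpg counterpart, is left in place.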

Answer 1 (score: 1)

The only way to achieve this, IMHO, is to create a helper class:

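public class FileUtil {
    String fileName; // base name: the file name without its extension
    File file;
    boolean delete = true;


    public FileUtil(String fileName, File file) {
        super();
        // Strip only the last extension; a name without a dot is kept whole
        int dot = fileName.lastIndexOf('.');
        this.fileName = dot > 0 ? fileName.substring(0, dot) : fileName;
        this.file = file;
    }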

    public String getFileName() {
        return fileName;
    }
    public void setFileName(String fileName) {
        this.fileName = fileName;
    }
    public File getFile() {
        return file;
    }
    public void setFile(File file) {
        this.file = file;
    }
    public boolean isDelete() {
        return delete;
    }
    public void setDelete(boolean delete) {
        this.delete = delete;
    }

    @Override
    public String toString() {
        return "FileUtil [fileName=" + fileName + ", file=" + file + ", delete=" + delete + "]";
    }

}

You can then use it to collect and delete your files:

try (Stream<Path> paths = Files.walk(Paths.get("c:/yourPath/"))) {
    // Wrap every regular file and group the wrappers by base name
    Map<String, List<FileUtil>> collect = paths
            .filter(Files::isRegularFile)
            .map(Path::toFile)
            .map(file -> new FileUtil(file.getName(), file))
            .collect(Collectors.groupingBy(FileUtil::getFileName));

    for (List<FileUtil> list : collect.values()) {
        if (list.size() > 1) {
            // Keep the first file of the group and delete the others
            list.get(0).setDelete(false);

            list.stream()
                .filter(FileUtil::isDelete)
                .forEach(fileUtil -> fileUtil.getFile().delete());
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}

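This way I keep one arbitrary item from each group (whichever comes first). If you prefer, you can modify the class to keep only a desired extension, e.g. .png.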
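One way to keep a desired extension rather than an arbitrary file is to rank the extensions and keep the best-ranked file in each group. The following is only a sketch under stated assumptions: the class `PreferredExtensionDedup`, its methods and the `preference` list are hypothetical names, and, unlike the snippet above, it groups by folder plus base name so that equally named files in different subfolders are not treated as duplicates.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the original answer.
final class PreferredExtensionDedup {

    // File name without its last extension ("image1.png" -> "image1").
    static String baseName(Path p) {
        String name = p.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot > 0 ? name.substring(0, dot) : name;
    }

    // Lower-cased extension without the dot, or "" if there is none.
    static String extension(Path p) {
        String name = p.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot > 0 ? name.substring(dot + 1).toLowerCase(Locale.ROOT) : "";
    }

    // Position in the preference list; extensions not listed rank last.
    static int rank(Path p, List<String> preference) {
        int i = preference.indexOf(extension(p));
        return i >= 0 ? i : preference.size();
    }

    // Keeps the best-ranked file of every (folder, base name) group under dir
    // and deletes the rest. Returns the deleted paths.
    static List<Path> dedupe(Path dir, List<String> preference) {
        Map<Path, List<Path>> groups;
        try (Stream<Path> paths = Files.walk(dir)) {
            groups = paths.filter(Files::isRegularFile)
                    .collect(Collectors.groupingBy(p -> p.resolveSibling(baseName(p))));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        List<Path> deleted = new ArrayList<>();
        for (List<Path> group : groups.values()) {
            if (group.size() < 2) {
                continue; // nothing to deduplicate
            }
            List<Path> sorted = new ArrayList<>(group);
            // Best rank first; ties are broken by path so the result is deterministic
            sorted.sort(Comparator.comparingInt((Path p) -> rank(p, preference))
                    .thenComparing(Comparator.<Path>naturalOrder()));
            for (Path duplicate : sorted.subList(1, sorted.size())) {
                try {
                    Files.delete(duplicate);
                    deleted.add(duplicate);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
        return deleted;
    }
}
```

Calling `PreferredExtensionDedup.dedupe(Paths.get("c:/yourPath/"), Arrays.asList("png", "jpg"))` would keep the .png copy where one exists and otherwise fall back to .jpg.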

I hope this helps :)

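Answer 2 (score: 1)

Here is an MCVE.

This example uses a Set to remove duplicate images automatically, given only the path of the folder/directory that contains the images (just a different idea, to show other available options and how to make use of Java's OO features).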

import java.io.File;
import java.util.HashSet;
import java.util.Set;

public class DuplicateRemover {

    // inner class to represent an image
    class Image{
        String path; // the absolute path of image file as a String

        // constructor
        public Image(String path) {
            this.path = path;
        }       

        @Override
        public boolean equals(Object o) {
            if(o instanceof Image){
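                // if both base names are equal -> delete this file; OpenJDK's HashSet.add()
                // calls equals() on the element being added, so the first file seen is kept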
                if(getBaseName(this.path).equals(getBaseName(((Image)o).path))){
                    File file = new File(this.path);
                    return file.delete();
                }
            }
            return false;
        }

        @Override
        public int hashCode() {
            return 0; // in this case, only "equals()" method is considered for duplicate check
         } 

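         /**
          * Returns the base name of the image, i.e. the given name without its extension
          * @param fileName the file name or path of the image
          * @return the name with the last ".extension" removed, if there is one
          */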
        private String getBaseName(String fileName) {
            int index = fileName.lastIndexOf('.'); 
            if (index == -1) { return fileName; } 
            else { return fileName.substring(0, index); }
         }
    }


    Set<Image> images; // a set of image files

    //constructor
    public DuplicateRemover(){
        images = new HashSet<>();
    } 

    /**
     * Get all the images from the given folder
     * and loop through all files to add them to the images set
     * @param dirPath path of the folder that contains the images
     */
    public void run(String dirPath){
        File dir = new File(dirPath);
        File[] listOfImages = dir.listFiles(); 
        if (listOfImages == null) { return; } // not a directory, or an I/O error occurred
        for (File f : listOfImages){
            if (f.isFile()) { 
                images.add(new Image(f.getAbsolutePath()));
            }
        }
    }


    //TEST
    public static void main(String[] args) {
        String dirPath = "C:\\Users\\Yahya Almardeny\\Desktop\\folder";
        /* dir contains: {image1.png, image1.jpeg, image1.jpg, image2.png}       */
        DuplicateRemover dr = new DuplicateRemover();
        // the images set will delete any duplicate image from the folder
        // according to the logic we provided in the "equals()" method
        dr.run(dirPath); 

        // print the images left in the folder
        for(Image image : dr.images) {
            System.out.println(image.path);
        }

        // Note: the set can be used for further processing later if needed
    }

}

Result:

C:\Users\Yahya Almardeny\Desktop\folder\image1.jpeg
C:\Users\Yahya Almardeny\Desktop\folder\image2.png
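Deleting files from inside equals() is a side effect that collections do not expect, and a constant hashCode() puts every element into the same bucket. As a hedged alternative, the same Set idea can be sketched with a plain Set of base names, relying on the fact that Set.add() returns false when the element is already present. The class `BaseNameSetDedup` and its methods are hypothetical names, not part of the answer above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the original answer.
final class BaseNameSetDedup {

    // File name without its last extension ("image1.png" -> "image1").
    static String baseName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    // Keeps the first file (in sorted order) of every base name in dir,
    // deletes the others, and returns the files that remain.
    static List<File> removeDuplicates(File dir) {
        List<File> kept = new ArrayList<>();
        File[] files = dir.listFiles();
        if (files == null) {
            return kept; // not a directory, or an I/O error occurred
        }
        Arrays.sort(files); // listFiles() gives no ordering guarantee
        Set<String> seen = new HashSet<>();
        for (File f : files) {
            if (!f.isFile()) {
                continue;
            }
            boolean firstWithThisName = seen.add(baseName(f.getName()));
            // A duplicate that could not be deleted is still on disk, so report it as kept
            if (firstWithThisName || !f.delete()) {
                kept.add(f);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: BaseNameSetDedup <folder>");
            return;
        }
        for (File f : removeDuplicates(new File(args[0]))) {
            System.out.println(f.getPath());
        }
    }
}
```

For the example folder above ({image1.png, image1.jpeg, image1.jpg, image2.png}) this keeps image1.jpeg and image2.png, because image1.jpeg sorts first among the image1 files.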