High-performance file search program

Time: 2012-08-06 05:39:19

Tags: java multithreading io

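I have a requirement to write a high-performance file search program. It should list all files and folders whose names match a supplied pattern, starting from a top-level folder and searching recursively through its subfolders.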

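The program can be a command-line main class with the following inputs:

1. The top-level folder to start the search from, for example C:\MyFolders.
2. The type of item to search for: files, folders, or both.
3. The search pattern. A Java regular expression (java.util.regex) is accepted as the pattern. For example, MFile .tx? will find UMFile123.txt and AIIMFile.txs.
4. A timeout (in seconds) within which the application must return; otherwise it must return with a "Could not complete operation" message.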

I came up with the following approach:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.List;

import com.sapient.test.fileSearch.FileSearch;

public class FilesearchMain {

    /**
     * @param args
     */
    public static void main(String[] args) {
        // TODO Auto-generated method stub
        int flag=0;
        System.out.println("Type Item to Search ");
        System.out.println("1 File");
        System.out.println("2 Folder ");
        System.out.println("3 Both");
        System.out.println("0 Exit");

        try{
        BufferedReader readType = new BufferedReader(new InputStreamReader(System.in));


        String searchType = readType.readLine();

        System.out.println("Enter name of file to search ::");

        BufferedReader readName = new BufferedReader(new InputStreamReader(System.in));
        String fileName=readName.readLine();


        if(searchType==null || fileName==null){
            throw new Exception("Error Occured::Provide both the input Parameters");
        }
        validateInputs(searchType,fileName);
        FileSearch fileSearch = new FileSearch(searchType,fileName);
        List resultList=fileSearch.findFiles();
        System.out.println(resultList);
        }catch(IOException io){
            System.out.println("Error Occured:: Check the input Parameters and try again");
        }catch(Exception e){
            System.out.println(e.getMessage());
        }
    }

    private static void validateInputs(String searchType, String fileName) 
    throws Exception{
        if(!(searchType.equals("1") || searchType.equals("2") || searchType.equals("3")) ){
            throw new Exception("Error:: Item to search can be only 1 or 2 or 3");
        }
        if(fileName.equals("")){
            // An empty search type is already rejected above; reject an empty name too
            throw new Exception("Error Occured:: Check the input Parameters and try again");
        }

    }

}

The other file is:

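package com.sapient.test.fileSearch;

import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class FileSearch {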
    private String searchType;
    private String fileName;

    public FileSearch(){

    }
    public FileSearch(String sType,String fName){
        this.searchType=sType;
        this.fileName=fName;
    }

    public String getSearchType() {
        return searchType;
    }
    public void setSearchType(String searchType) {
        this.searchType = searchType;
    }
    public String getFileName() {
        return fileName;
    }
    public void setFileName(String fileName) {
        this.fileName = fileName;
    }
    public List findFiles(){

        File file = new File("C:\\MyFolders");

        return searchInDirectory(file);

    }
    //Assuming that files to search should contain the typed name by the user
    //
    private List<String> searchInDirectory(File dirName){
        List<String> filesList = new ArrayList<String>();
        if(dirName.isDirectory()){
            File [] listFiles = dirName.listFiles();
            if(listFiles == null){
                // listFiles() returns null if the directory cannot be read
                return filesList;
            }
            for(File searchedFile : listFiles){
                if(searchedFile.isFile() && searchedFile.getName().toUpperCase().contains(getFileName().toUpperCase())&& 
                        (getSearchType().equals("1") || getSearchType().equals("3") ) ){
                    filesList.add(searchedFile.getName());
                }else if(searchedFile.isDirectory() && searchedFile.getName().toUpperCase().contains(getFileName().toUpperCase())
                    &&  (getSearchType().equals("2") || getSearchType().equals("3") ) ){
                    filesList.add(searchedFile.getName());
                    // Keep the matches found in the subfolder instead of discarding them
                    filesList.addAll(searchInDirectory(searchedFile));
                }else{
                    // Non-matching folders are still searched; plain files yield an empty list
                    filesList.addAll(searchInDirectory(searchedFile));
                }
            }
        }
        return filesList;
    }

}


Please advise whether this approach is correct from a design standpoint.

1 answer:

Answer 0 (score: 4)

if (topFolderOrFile.isDirectory()) {
    File[] subFoldersAndFileNames = topFolderOrFile.listFiles(fileFilter);
    // ...
}

where fileFilter looks something like this:

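public class MyFileFilter implements FileFilter {

    // Compiled once and supplied by the caller
    private final Pattern fileNamePattern;

    public MyFileFilter(Pattern fileNamePattern) {
        this.fileNamePattern = fileNamePattern;
    }

    public boolean accept(File pathname) {
        return fileNamePattern.matcher(pathname.getName()).find();
    }

}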

This basically guarantees that the returned list of files meets the criteria in the FileFilter.
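
Note that the filter uses find(), which accepts a name if the pattern occurs anywhere in it, whereas matches() requires the whole name to match. A small sketch of the difference follows. The class name and the regex MFile.*\.tx. are my own assumptions for illustration, not something taken from the question:

```java
import java.util.regex.Pattern;

public class FindVersusMatches {

    // Hypothetical pattern, chosen only for illustration
    static final Pattern NAME_PATTERN = Pattern.compile("MFile.*\\.tx.");

    /** True if the pattern occurs anywhere in the name (what the filter does). */
    public static boolean containsMatch(String fileName) {
        return NAME_PATTERN.matcher(fileName).find();
    }

    /** True only if the entire name matches the pattern. */
    public static boolean wholeNameMatches(String fileName) {
        return NAME_PATTERN.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        String[] names = {"UMFile123.txt", "AIIMFile.txs", "MFile1.txt", "notes.txt"};
        for (String name : names) {
            System.out.println(name
                    + " find=" + containsMatch(name)
                    + " matches=" + wholeNameMatches(name));
        }
    }
}
```

With this pattern, UMFile123.txt is accepted by find() but rejected by matches(), because the name starts with a character the pattern does not cover. Which behaviour you want depends on how you read the requirement.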

Now, in this case it's mostly semantics, because for the listFiles method to work it still has to iterate over every file.

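You could try maintaining a single instance of the filter rather than recreating it on each iteration, but you would need to profile the algorithm to see what difference, if any, this makes.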

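Also, you could deploy some kind of thread queue, where each thread is responsible for checking a given directory for matches and queuing any new subdirectories. Just a thought.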
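
A rough sketch of that thread-queue idea, assuming a fixed-size pool from java.util.concurrent. The class name ParallelFileSearch, the pending counter and the latch are my own choices, not part of the question, so treat this as a starting point rather than a finished design:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Pattern;

public class ParallelFileSearch {

    private final Pattern namePattern;
    private final ExecutorService pool;
    private final ConcurrentLinkedQueue<String> matches = new ConcurrentLinkedQueue<String>();
    // Number of directories that are queued or currently being scanned
    private final AtomicInteger pending = new AtomicInteger();
    private final CountDownLatch finished = new CountDownLatch(1);

    public ParallelFileSearch(Pattern namePattern, int threads) {
        this.namePattern = namePattern;
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Returns the matching paths, or null if the search did not finish in time. */
    public List<String> search(File top, long timeoutSeconds) throws InterruptedException {
        submit(top);
        boolean completed = finished.await(timeoutSeconds, TimeUnit.SECONDS);
        pool.shutdownNow();
        return completed ? new ArrayList<String>(matches) : null;
    }

    // Queues one directory; the counter is raised before the task can run
    private void submit(final File dir) {
        pending.incrementAndGet();
        try {
            pool.execute(new Runnable() {
                public void run() {
                    try {
                        scan(dir);
                    } finally {
                        taskDone();
                    }
                }
            });
        } catch (RejectedExecutionException e) {
            taskDone(); // the pool was already shut down after a timeout
        }
    }

    private void taskDone() {
        if (pending.decrementAndGet() == 0) {
            finished.countDown(); // nothing queued and nothing running
        }
    }

    // Checks one directory for matches and queues its subdirectories
    private void scan(File dir) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File child : children) {
            if (namePattern.matcher(child.getName()).find()) {
                matches.add(child.getPath());
            }
            if (child.isDirectory()) {
                submit(child);
            }
        }
    }
}
```

A caller would do something like new ParallelFileSearch(pattern, 4).search(new File("C:\\MyFolders"), timeOut) and print "Could not complete operation" when the result is null. Whether this beats the single-threaded version depends heavily on the disk, so measure before committing to it.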

Reuse the Pattern

public static void searchFile(String topFolderName, String type,
        String fileNamePatternRegExp, long timeOut) throws IOException {

    long startTimeStamp = Calendar.getInstance().getTimeInMillis();

    File topFolderOrFile = new File(topFolderName);
    Pattern fileNamePattern = Pattern.compile(fileNamePatternRegExp);

    searchFile(topFolderName, type, fileNamePattern, timeOut);

}

public static void searchFile(String topFolderName, String type,
        Pattern fileNamePattern, long timeOut) throws IOException {
    //...
}

These are the basic changes I made, but really, you will have to decide whether they work for you.

public static class PatternFileFilter implements FileFilter {

    private Pattern fileNamePattern;

    public PatternFileFilter(Pattern fileNamePattern) {

        this.fileNamePattern = fileNamePattern;

    }

    @Override
    public boolean accept(File pathname) {

        return fileNamePattern.matcher(pathname.getName()).find() || pathname.isDirectory();

    }

    public Pattern getPattern() {
        return fileNamePattern;
    }
}

public static void searchFile(File topFolderOrFile, String type, PatternFileFilter filter, long timeOut) throws IOException {

    long startTimeStamp = Calendar.getInstance().getTimeInMillis();

    if (topFolderOrFile.isDirectory()) {

        File[] subFoldersAndFileNames = topFolderOrFile.listFiles(filter);
        if (subFoldersAndFileNames != null && subFoldersAndFileNames.length > 0) {
            for (File subFolderOrFile : subFoldersAndFileNames) {

                if (ITEM_TYPE_FILE.equals(type) && subFolderOrFile.isFile()) {
                    System.out.println("File name matched ----- "
                            + subFolderOrFile.getName());
                }
                if (ITEM_TYPE_FOLDER.equals(type)
                        && subFolderOrFile.isDirectory() && filter.getPattern().matcher(subFolderOrFile.getName()).find()) {
                    System.out.println("Folder name matched ----- "
                            + subFolderOrFile.getName());
                }
                if (ITEM_TYPE_FILE_AND_FOLDER.equals(type) && filter.getPattern().matcher(subFolderOrFile.getName()).find()) {
                    System.out.println("File or Folder name matched ----- "
                            + subFolderOrFile.getName());
                }

                // You need to decide if you want to process the folders inline
                // or after you've processed the file list...

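                if (subFolderOrFile.isDirectory()) {
                    // Elapsed time is "now - start", in milliseconds
                    long timeElapsed = Calendar.getInstance().getTimeInMillis()
                            - startTimeStamp;
                    long remainingMillis = (timeOut * 1000) - timeElapsed;
                    if (remainingMillis < 0) {
                        System.out
                                .println("Could not complete operation-- timeout");
                    } else {
                        // timeOut is in seconds, so convert the remainder back
                        // (integer division truncates to whole seconds)
                        searchFile(subFolderOrFile, type,
                                filter, remainingMillis / 1000);
                    }
                }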
            }

        }

    }

}

public static void searchFile(String topFolderName, String type, String fileNamePatternRegExp, long timeOut) throws IOException {

    File topFolderOrFile = new File(topFolderName);
    Pattern fileNamePattern = Pattern.compile(fileNamePatternRegExp);

    searchFile(topFolderOrFile, type, new PatternFileFilter(fileNamePattern), timeOut);

}
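
The recursive version above works out the remaining time again at every level. One alternative, sketched here under my own assumptions (the class name DeadlineSearch, the boolean return value and the example regex are all hypothetical), is to compute a single absolute deadline up front and just compare against it during the walk:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;

public class DeadlineSearch {

    /**
     * Collects paths whose names contain a match for the pattern.
     * Returns false if the deadline passed before the walk finished.
     */
    public static boolean search(File dir, Pattern pattern, long deadlineNanos, List<String> out) {
        // Subtraction keeps the comparison safe even if nanoTime wraps around
        if (System.nanoTime() - deadlineNanos > 0) {
            return false;
        }
        File[] children = dir.listFiles();
        if (children == null) {
            return true; // not a directory, or not readable
        }
        for (File child : children) {
            if (pattern.matcher(child.getName()).find()) {
                out.add(child.getPath());
            }
            if (child.isDirectory() && !search(child, pattern, deadlineNanos, out)) {
                return false; // propagate the timeout to the caller
            }
        }
        return true;
    }

    public static void main(String[] args) {
        File top = new File(args.length > 0 ? args[0] : ".");
        long timeoutSeconds = 5; // illustrative value
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(timeoutSeconds);
        List<String> found = new ArrayList<String>();
        if (search(top, Pattern.compile("MFile.*\\.tx."), deadline, found)) {
            System.out.println(found);
        } else {
            System.out.println("Could not complete operation");
        }
    }
}
```

The deadline is only checked between directory listings, so a single very large or slow directory can still overrun it slightly.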

I'll just say this: here is a fish, now you need to learn how to fish ;)