From a file of paths to a unique data structure

Time: 2016-08-04 20:59:40

Tags: java data-structures

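I am reading a list of paths from a file. I want to store them in a built-in Java structure that removes duplicates automatically. By "duplicate" I mean this: if I already have /usr/bin and then add /usr, the /usr/bin entry must be removed because it is contained in the /usr folder. I read the file sequentially, so if possible I don't want to go over all the data a second time.

Example code: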

UnknownType<Path> database;
BufferedReader reader = new BufferedReader(new FileReader(new File("db.txt")));

String line;
while ((line = reader.readLine()) != null) {
    Path path = Paths.get(line).toRealPath();
    database.add(path);
}

Example file:

/usr/bin
/usr
/dev
/dev/sda1
/dev/sda2
/home/user/Desktop/file.txt
/home/user/Documents/file2.txt
/home/user/Documents/file3.txt

Expected output:

data structure containing paths: 
/usr
/dev
/home/user/Desktop/file.txt
/home/user/Documents/file2.txt
/home/user/Documents/file3.txt

3 answers:

Answer 0 (score: 1)

A simple solution:

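import java.nio.file.Path;
import java.util.ArrayList;

class Database {

  public void add(Path p) {
    boolean placed = false;
    for (int i = 0; i < paths.size(); i++) {
      Path p2 = paths.get(i);
      if (p2.startsWith(p)) {
        if (placed) {
          // p already replaced an earlier entry; drop this contained path too
          paths.remove(i--);
        } else {
          // replace the first contained path with the new, shorter one
          paths.set(i, p);
          placed = true;
        }
      } else if (p.startsWith(p2)) {
        // p is contained in a stored path: don't add it
        return;
      }
    }
    // nothing was replaced, so add the new one
    if (!placed) paths.add(p);
  }

  ArrayList<Path> paths = new ArrayList<>();

}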

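LinkedList implementation:

import java.nio.file.Path;
import java.util.LinkedList;
import java.util.ListIterator;

class Database {

  public void add(Path p) {
    boolean placed = false;
    for (ListIterator<Path> it = paths.listIterator(0); it.hasNext();) {
      Path p2 = it.next();
      if (p2.startsWith(p)) {
        if (placed) {
          // p already replaced an earlier entry; drop this contained path too
          it.remove();
        } else {
          // replace the first contained path with the new, shorter one
          it.set(p);
          placed = true;
        }
      } else if (p.startsWith(p2)) {
        // p is contained in a stored path: don't add it
        return;
      }
    }
    // nothing was replaced, so add the new one
    if (!placed) paths.add(p);
  }

  LinkedList<Path> paths = new LinkedList<>();

}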
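
Both versions depend on Path.startsWith comparing whole name elements rather than characters, which is what makes it usable as a containment test. The following is only a small sketch to illustrate that behaviour; the class name StartsWithDemo and the helper isInside are made up for this example and are not part of the answer.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class StartsWithDemo {
    // Hypothetical helper: true when 'child' is 'parent' itself or lies below it.
    static boolean isInside(Path child, Path parent) {
        return child.startsWith(parent);
    }

    public static void main(String[] args) {
        Path usr = Paths.get("/usr");
        System.out.println(isInside(Paths.get("/usr/bin"), usr)); // true
        System.out.println(isInside(Paths.get("/usrbin"), usr));  // false: "usrbin" is a different element
        System.out.println(isInside(usr, usr));                   // true: a path starts with itself
        // A plain string prefix check would wrongly treat /usrbin as inside /usr.
        System.out.println("/usrbin".startsWith("/usr"));         // true
    }
}
```

Because a path starts with itself, adding the same path twice does not create a second entry.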

Answer 1 (score: 1)

A tree-based solution (probably more efficient):

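import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Database {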

  public void add(String p) {
    root.add(Arrays.asList(p.split("\\\\|/")), 0);
  }

  public void addAll(Collection<? extends String> list) {
    for (String p : list)
      add(p);
  }

  public List<String> getPathsList() {
    ArrayList<String> list = new ArrayList<>();
    root.listPaths(list, "");
    return list;
  }

  PathNode root = new PathNode("");

  static class PathNode {

    public final String name;
    public Map<String, PathNode> children = new HashMap<>();

    public PathNode(String name) {
      this.name = name;
    }

    public boolean isLeaf() {
      return children.size()==0;
    }

    public boolean isRoot() {
      return name.isEmpty();
    }

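    public void add(List<String> path, int i) {
      String childName = path.get(i);
      PathNode child = children.get(childName);

      if (child != null) {
        // last segment reached: the new path contains everything below child
        if (path.size()-i <= 1) child.children.clear();
        // otherwise keep descending
        else child.add(path, i+1);
      } else if (!isLeaf() || isRoot()) {
        // an unknown segment under a leaf means a stored path already contains
        // the new one, so only non-leaf nodes (or the root) grow a new branch
        PathNode node = this;
        for (; i < path.size(); i++) {
          String key = path.get(i);
          node.children.put(key, node = new PathNode(key));
        }
      }
    }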

    public void listPaths(ArrayList<String> list, String prefix) {
      for (PathNode child : children.values()) {
        if (child.isLeaf()) list.add(prefix+child.name);
        else child.listPaths(list, prefix+child.name+File.separator);
      }
    }

  }

}

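Test verifying correctness: http://ideone.com/cvqEVT

This implementation accepts both Windows and Unix paths, whatever platform it runs on. The paths returned by Database.getPathsList() still use the operating system's file separator; you can change that by replacing File.separator in Database.PathNode.listPaths (the last line of actual code).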
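
To make the separator handling concrete, here is a minimal sketch of what the split pattern from add produces for each path style, and of how rejoining with a separator of your choice would look. SplitDemo, segments and join are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

class SplitDemo {
    // Same pattern as in the answer: split on a backslash or a forward slash.
    static List<String> segments(String path) {
        return Arrays.asList(path.split("\\\\|/"));
    }

    // Hypothetical helper: rebuild a path string with an explicit separator.
    static String join(List<String> segments, String separator) {
        return String.join(separator, segments);
    }

    public static void main(String[] args) {
        // A leading slash yields an empty first segment.
        System.out.println(segments("/usr/bin"));                 // [, usr, bin]
        System.out.println(segments("C:\\Users\\me"));            // [C:, Users, me]
        System.out.println(join(segments("C:\\Users\\me"), "/")); // C:/Users/me
    }
}
```

The empty first segment of an absolute Unix path becomes a tree node of its own with an empty name, which is why the listed paths start with a separator again.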

Answer 2 (score: 0)

static ArrayList<Path> paths = new ArrayList<Path>();

public static void main (String[]args) { 
    add(Paths.get("/usr/bin"));
    add(Paths.get("/usr"));
    add(Paths.get("/dev"));
    add(Paths.get("/dev/sda"));
    add(Paths.get("/home/user/Desktop/file.txt"));
    System.out.println(paths.toString());
} 

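public static void add(Path path){
    // first directory of the new path (assumes at least one name element)
    String firstDir = path.subpath(0, 1).toString();
    boolean placed = false;
    // check all known paths
    for (int q = 0; q < paths.size(); q++){
        Path p = paths.get(q);
        // first directory of the saved path
        String pFirstDir = p.subpath(0, 1).toString();

        // paths with different first directories cannot contain each other
        if (!pFirstDir.equals(firstDir)){
            continue;
        }
        // the new path is already covered by a saved path, so skip it
        if (path.startsWith(p)){
            return;
        }
        // the saved path lies inside the new one: replace the first such
        // entry and remove any later ones
        if (p.startsWith(path)){
            if (placed){
                paths.remove(q--);
            } else {
                paths.set(q, path);
                placed = true;
            }
        }
    }
    // no matching paths found, so add
    if (!placed){
        paths.add(path);
    }
}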

This prints (on Windows, hence the backslashes):

[\usr, \dev, \home\user\Desktop\file.txt]
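
The first-directory comparison above works on Path name elements. As a rough illustration of how subpath(0, 1) and getNameCount() behave, the sketch below uses two paths from the question's example file; NameElementsDemo and firstDir are hypothetical names introduced for this example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class NameElementsDemo {
    // First name element of a path, e.g. "home" for /home/user.
    // Assumes at least one name element; a bare root such as "/" has none.
    static String firstDir(Path path) {
        return path.subpath(0, 1).toString();
    }

    public static void main(String[] args) {
        Path desktop = Paths.get("/home/user/Desktop/file.txt");
        Path documents = Paths.get("/home/user/Documents/file2.txt");
        System.out.println(firstDir(desktop));             // home
        System.out.println(firstDir(documents));           // home
        System.out.println(desktop.getNameCount());        // 4
        // Same first directory, yet neither path contains the other.
        System.out.println(desktop.startsWith(documents)); // false
        System.out.println(documents.startsWith(desktop)); // false
    }
}
```

Sharing a first directory is therefore necessary for containment but not sufficient; Path.startsWith is what settles whether one path really lies inside another.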