Question

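I am using the Apache Commons 1.4.1 library to decompress a ".tar" file.

Problem: I don't need to extract all the files. I need to extract specific files from specific locations inside the tar archive. I only need a few .xml files, but the TAR file is about 300 MB, and decompressing the whole thing wastes resources.

I am stuck and unsure whether I have to do nested directory comparisons or whether there is a workaround.

Note: the location of the .xml files (the required files) is always the same.

The structure of the TAR is: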

directory:E:\Root\data
     file:E:\Root\datasheet.txt
directory:E:\Root\map
     file:E:\Root\mapers.txt
directory:E:\Root\ui
     file:E:\Root\ui\capital.txt
     file:E:\Root\ui\info.txt
directory:E:\Root\ui\sales
     file:E:\Root\ui\sales\Reqest_01.xml
     file:E:\Root\ui\sales\Reqest_02.xml
     file:E:\Root\ui\sales\Reqest_03.xml
     file:E:\Root\ui\sales\Reqest_04.xml
directory:E:\Root\ui\sales\stores
directory:E:\Root\ui\stores
directory:E:\Root\urls
directory:E:\Root\urls\fullfilment
     file:E:\Root\urls\fullfilment\Cams_01.xml
     file:E:\Root\urls\fullfilment\Cams_02.xml
     file:E:\Root\urls\fullfilment\Cams_03.xml
     file:E:\Root\urls\fullfilment\Cams_04.xml
directory:E:\Root\urls\fullfilment\profile
directory:E:\Root\urls\fullfilment\registration
     file:E:\Root\urls\options.txt
directory:E:\Root\urls\profile

Constraint: I can't use JDK 7 and must stick with the Apache Commons library.

My current solution:

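import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
// TarEntry and TarInputStream come from the tar library in use;
// their import lines are not shown in the original post.

public static void untar(File[] files) throws Exception {
        String path = files[0].toString();
        File tarPath = new File(path);
        // Directory that contains the archive; everything is extracted next to it.
        String pathWithoutName = path.substring(0, path.indexOf(tarPath.getName()));
        TarEntry entry;
        TarInputStream inputStream = null;
        try {
            inputStream = new TarInputStream(new FileInputStream(tarPath));
            byte[] buffer = new byte[1024];
            while (null != (entry = inputStream.getNextEntry())) {
                int bytesRead;
                System.out.println("tarpath:" + tarPath.getName());
                System.out.println("Entry:" + entry.getName());
                System.out.println("pathname:" + pathWithoutName);
                if (entry.isDirectory()) {
                    // mkdirs() also creates any missing parent directories.
                    new File(pathWithoutName + entry.getName()).mkdirs();
                    continue;
                }
                FileOutputStream outputStream = new FileOutputStream(pathWithoutName + entry.getName());
                try {
                    // Copy the bytes of the current entry only.
                    while ((bytesRead = inputStream.read(buffer, 0, buffer.length)) > -1) {
                        outputStream.write(buffer, 0, bytesRead);
                    }
                } finally {
                    // Close each extracted file as soon as it has been written.
                    outputStream.close();
                }
                System.out.println("Extracted " + entry.getName());
            }
        } finally {
            if (inputStream != null) {
                inputStream.close();
            }
        }
}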

Answer 1

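The TAR file format is designed to be written or read as a stream (that is, to or from a tape drive) and has no centralized header. So no, there is no way around reading through the whole file to extract individual entries.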
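To make the sequential layout concrete, here is a minimal sketch in plain Java (no tar library) that walks an archive header by header and keeps only the entries under one directory. It is an illustration under simplifying assumptions, not how any particular library is implemented: the class and method names (`TarScanSketch`, `extractMatching`, `writeEntry`) are hypothetical, entry names are assumed to be stored with forward slashes such as `Root/ui/sales/Reqest_01.xml`, and the parser ignores checksums, the ustar prefix field, GNU long names, and PAX headers. The point is that unwanted entries still have to be stepped over, but their data never has to be written to disk. The same name check could go inside the `getNextEntry()` loop from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration class; not part of any library. */
final class TarScanSketch {
    private static final int BLOCK = 512; // tar headers and data are aligned to 512-byte blocks

    /** Fills buf[0..len); returns false only if the stream ends before the first byte. */
    private static boolean readFully(InputStream in, byte[] buf, int len) throws IOException {
        int off = 0;
        while (off < len) {
            int n = in.read(buf, off, len - off);
            if (n < 0) {
                if (off == 0) return false;
                throw new EOFException("truncated tar block");
            }
            off += n;
        }
        return true;
    }

    /** Skips exactly n bytes, falling back to read() when skip() makes no progress. */
    private static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long s = in.skip(n);
            if (s <= 0) {
                if (in.read() < 0) throw new EOFException("truncated tar data");
                s = 1;
            }
            n -= s;
        }
    }

    /** Decodes a NUL-terminated ASCII field from a header block. */
    private static String field(byte[] b, int off, int max) throws IOException {
        int len = 0;
        while (len < max && b[off + len] != 0) len++;
        return new String(b, off, len, "US-ASCII");
    }

    /** Walks the headers in order; keeps only ".xml" entries whose name starts with prefix. */
    static Map<String, byte[]> extractMatching(InputStream in, String prefix) throws IOException {
        Map<String, byte[]> found = new LinkedHashMap<String, byte[]>();
        byte[] header = new byte[BLOCK];
        while (readFully(in, header, BLOCK)) {
            if (header[0] == 0) break;                                 // zero block: end of archive
            String name = field(header, 0, 100);                       // name field: bytes 0..99
            long size = Long.parseLong(field(header, 124, 12).trim(), 8); // size field: octal text
            long padded = (size + BLOCK - 1) / BLOCK * BLOCK;          // data is padded to a full block
            if (name.startsWith(prefix) && name.endsWith(".xml")) {
                byte[] data = new byte[(int) size];
                if (!readFully(in, data, data.length)) throw new EOFException("missing entry data");
                skipFully(in, padded - size);
                found.put(name, data);
            } else {
                skipFully(in, padded);                                 // step over unwanted data
            }
        }
        return found;
    }

    /** Demo helper: writes one entry. The checksum is left blank, so real tar tools would reject it. */
    static void writeEntry(ByteArrayOutputStream out, String name, byte[] data) throws IOException {
        byte[] header = new byte[BLOCK];
        byte[] n = name.getBytes("US-ASCII");
        System.arraycopy(n, 0, header, 0, n.length);
        byte[] size = String.format("%011o", data.length).getBytes("US-ASCII");
        System.arraycopy(size, 0, header, 124, size.length);
        header[156] = '0';                                             // type flag: regular file
        out.write(header);
        out.write(data);
        out.write(new byte[(BLOCK - data.length % BLOCK) % BLOCK]);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream tar = new ByteArrayOutputStream();
        writeEntry(tar, "Root/ui/info.txt", "skip me".getBytes("US-ASCII"));
        writeEntry(tar, "Root/ui/sales/Reqest_01.xml", "<r/>".getBytes("US-ASCII"));
        tar.write(new byte[2 * BLOCK]);                                // end-of-archive marker
        Map<String, byte[]> hits =
                extractMatching(new ByteArrayInputStream(tar.toByteArray()), "Root/ui/sales/");
        System.out.println(hits.keySet());
    }
}
```

Because each header only says how long its own data is, the position of the next entry is unknown until the current header has been read. That is why the scan cannot jump straight to `Root/ui/sales`.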

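If you want random access, you should use the ZIP format and open it with the JDK's ZipFile. Assuming you have enough virtual memory, the file will be memory-mapped, which makes random access very fast (I haven't checked whether it falls back to a random-access file when memory mapping isn't possible).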
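For comparison, a rough sketch of the ZIP route using only `java.util.zip`, written without JDK 7 features. The class name `ZipLookupSketch`, the helper names, and the entry names are made up for illustration. `ZipFile.getEntry` looks an entry up by name, so the code does not loop over the other entries.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

/** Hypothetical illustration class; not part of any library. */
final class ZipLookupSketch {

    /** Returns the bytes of one named entry, or null if the archive has no such entry. */
    static byte[] readEntry(File zip, String entryName) throws IOException {
        ZipFile zf = new ZipFile(zip);
        try {
            ZipEntry entry = zf.getEntry(entryName); // direct lookup by name
            if (entry == null) {
                return null;
            }
            InputStream in = zf.getInputStream(entry);
            try {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } finally {
                in.close();
            }
        } finally {
            zf.close();
        }
    }

    /** Demo helper: writes a two-entry ZIP to a temporary file. */
    static File writeDemoZip() throws IOException {
        File f = File.createTempFile("demo", ".zip");
        f.deleteOnExit();
        ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(f));
        try {
            zos.putNextEntry(new ZipEntry("Root/ui/info.txt"));
            zos.write("skip me".getBytes("US-ASCII"));
            zos.closeEntry();
            zos.putNextEntry(new ZipEntry("Root/ui/sales/Reqest_01.xml"));
            zos.write("<r/>".getBytes("US-ASCII"));
            zos.closeEntry();
        } finally {
            zos.close();
        }
        return f;
    }

    public static void main(String[] args) throws IOException {
        File zip = writeDemoZip();
        byte[] xml = readEntry(zip, "Root/ui/sales/Reqest_01.xml");
        System.out.println(new String(xml, "US-ASCII"));
    }
}
```

This only helps if whoever produces the archive can switch it to ZIP. For an existing TAR the sequential scan is still required.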

How to extract specific files from a TAR using Apache Commons?
