Question

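Suppose I have a big file on a big disk, and this file almost entirely fills the disk. For example, a 10TB disk with an almost-10TB file, with about 3GB free. Also, I don't have any other storage available.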

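I would like to split this file into N parts, but for the simple case splitting it in half is fine. Since the required solution may be FS-specific: I am on an ext4 filesystem.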

I know about https://www.gnu.org/software/coreutils/manual/coreutils.html#split-invocation

Obviously, there is not enough free space on the device to create the split by copying.

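Is it possible to somehow split file A (~10TB) into two files B and C, such that B and C are just new "references" to the original data of file A?

I.e., B has the same start (A_start = B_start) but a smaller length, and C starts at B_start + B_length and has C_length = A_length - B_length.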

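After the operation, file A may or may not still exist in the FS. Also, I would be fine with some constraint/restriction, for example that the split is only possible at certain sector/block boundaries (i.e. only on a 4096-byte grid).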

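The same question applies to the reverse case:

Given two files of almost 5TB each on a 10TB disk: concatenate these files into one resulting file of almost 10TB by only adjusting the "inode references".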

Sorry if the nomenclature is not precise; I hope it is clear what I want to achieve.

Answer 1

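First, there is currently no guaranteed-portable way to do what you want; any solution will be platform-specific, because doing what you want requires your underlying filesystem to support sparse files.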
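To see whether a filesystem actually produces sparse files, one can compare a file's logical size with the space it occupies on disk. The following is only a minimal sketch: the helper names allocated_bytes and looks_sparse are made up for illustration, and it assumes st_blocks counts 512-byte units, which holds on Linux but is not guaranteed everywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

// Hypothetical helper: bytes actually allocated on disk for a file.
// Assumption: st_blocks is in 512-byte units (true on Linux).
long long allocated_bytes( const struct stat *sb )
{
    assert( sb != NULL );
    return ( long long ) sb->st_blocks * 512LL;
}

// Hypothetical helper: non-zero if the file occupies less disk space
// than its logical size, i.e. at least part of it is a hole.
int looks_sparse( const struct stat *sb )
{
    return allocated_bytes( sb ) < ( long long ) sb->st_size;
}
```

Filling a struct stat with fstat() on a file that was extended with ftruncate() and then passing it to looks_sparse() should report a sparse file if the filesystem supports holes.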

If the underlying filesystem creates sparse files, code like this can split a file in half (error checking omitted for clarity):

#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// 1MB chunks (use a power of two)
#define CHUNKSIZE ( 1024L * 1024L )

int main( int argc, char **argv )
{
    // argv[ 1 ] is the file to split, argv[ 2 ] receives the second half
    int origFD = open( argv[ 1 ], O_RDWR );
    int newFD = open( argv[ 2 ], O_WRONLY | O_CREAT | O_TRUNC, 0644 );

    // get the size of the input file
    struct stat sb;
    fstat( origFD, &sb );

    // get a CHUNKSIZE-aligned offset near the middle of the file
    off_t startOffset = ( sb.st_size / 2L ) & ~( CHUNKSIZE - 1L );

    // get the largest CHUNKSIZE-aligned offset in the file
    off_t readOffset = sb.st_size & ~( CHUNKSIZE - 1L );

    // allocate on the heap - 1MB may not fit on the stack
    char *ioBuffer = malloc( CHUNKSIZE );

    while ( readOffset >= startOffset )
    {
        // read the last remaining chunk of the original file
        ssize_t bytesRead = pread(
            origFD, ioBuffer, CHUNKSIZE, readOffset );

        // write the data to the end of the new file - the underlying
        // filesystem had better create a sparse file or this can
        // fill up the disk on the first pwrite() call
        pwrite( newFD, ioBuffer, bytesRead, readOffset - startOffset );

        // cut the end off the input file - this had better free up
        // disk space
        ftruncate( origFD, readOffset );
        readOffset -= CHUNKSIZE;
    }

    free( ioBuffer );
    close( origFD );
    close( newFD );
    return( 0 );
}
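The two offsets in the code above rely on a bit mask that rounds down to a multiple of the chunk size, which only works when the chunk size is a power of two. As a small illustration (align_down is a hypothetical helper name, not part of the code above), the same arithmetic can be isolated like this:

```c
#include <assert.h>
#include <sys/types.h>

// Hypothetical helper: round offset down to a multiple of chunk.
// The mask trick is only valid when chunk is a power of two.
off_t align_down( off_t offset, off_t chunk )
{
    assert( chunk > 0 && ( chunk & ( chunk - 1 ) ) == 0 );
    return offset & ~( chunk - 1 );
}
```

For a 5,500,000-byte file and 1MB chunks, the split point is align_down( 2750000, 1048576 ), which is 2,097,152, and the first chunk to move starts at align_down( 5500000, 1048576 ), which is 5,242,880. The original file would keep its first 2,097,152 bytes and the new file would receive the remaining 3,402,848 bytes.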

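There are other ways. On Solaris systems you can use fcntl() with the F_FREESP command, and on Linux systems that support FALLOC_FL_PUNCH_HOLE you can use the fallocate() function, to remove arbitrary blocks from the file after you have copied the data to another file. On such systems you are not limited to only being able to cut off the end of the original file with ftruncate().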
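On Linux, the hole-punching call could look roughly like the sketch below. It is only an illustration: punch_hole is a made-up wrapper name, and whether the call succeeds depends on the kernel and filesystem (ext4 supports it on reasonably recent kernels). The range should be aligned to the filesystem block size, since only whole blocks inside the range are actually deallocated.

```c
#define _GNU_SOURCE   // must come before any #include to expose fallocate()
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>

// Hypothetical wrapper: deallocate [offset, offset + length) in fd while
// keeping the file size unchanged. Returns 0 on success, or the errno
// value on failure (for example EOPNOTSUPP if the filesystem cannot
// punch holes).
int punch_hole( int fd, off_t offset, off_t length )
{
    assert( offset >= 0 && length > 0 );
    if ( fallocate( fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                    offset, length ) != 0 )
    {
        return errno;
    }
    return 0;
}
```

With such a call, chunks could be copied out of the original file in any order, freeing each chunk's blocks right after it has been written to the new file; reads from a punched range return zeros.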

ext4: splitting and concatenating files in place
