Question

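I'm trying to write a function that copies a file in C. It needs to handle any kind of file, whether text, binary, or some other format. This is what I have so far, but my implementation seems to be broken. Can someone point out what I'm doing wrong and how to fix it?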

// Copies the file from source to destination and returns number of bytes written
ssize_t copy_file(char* source, char* destination, int size)
{
    if (source == NULL || destination == NULL || access(source, F_OK) == -1)
        return 0;

    int fd_to = open(destination, O_WRONLY | O_CREAT | O_TRUNC, 0777);
    int fd_from = open(source, O_RDONLY);
    char* buffer = malloc(sizeof(size));
    ssize_t written;

    if (fd_to < 0 | fd_from < 0)
        return 0;

    read(fd_from, buffer, size);
    written = write(fd_to, buffer, size);
    close(fd_to);
    close(fd_from);
    free(buffer);

    return written;
}

Answer 1

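Having a buffer as large as the file is uneconomical for large buffer sizes ("large" depends on the platform and OS, but I doubt you get past, say, one megabyte before hitting diminishing returns). Some systems let you allocate much more memory than is physically available, with the buffer backed by the swap area on disk. In that case, if you try to copy the whole file in one go, you may end up reading most of the file and writing it to swap, then reading it back from swap into the new file, which effectively doubles (at least) the copy time.

So I would use a loop.

You also need to check for errors in memory allocation and in file writes, and bear in mind that declaring size as an int may cause problems with large files (2 GB is a realistic file size nowadays, yet it overflows a 32-bit signed integer).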

// Copies a part of a file from source to destination
// and returns number of bytes written.
// if input size is < 0, copies the whole file.

ssize_t copy_file(char* source, char* destination, int size)
{
    if ((source == NULL) || (destination == NULL) || (access(source, F_OK) == -1)) {
        return 0;
    }

    #define BUFFER_SIZE 1048576
    char* buffer = malloc(BUFFER_SIZE);
    if (NULL == buffer) {
        return 0;
    }

    int fd_from = open(source, O_RDONLY);
    if (fd_from < 0) {
        free(buffer);
        return 0;
    }
    int fd_to = open(destination, O_WRONLY | O_CREAT | O_TRUNC, 0777);
    if (fd_to < 0) {
        free(buffer);
        // Avoid leaking a file handle in case of error.
        close(fd_from);
        return 0;
    }

    ssize_t written = 0;
    // This checks that size is != 0.
    // As a result, passing a size < 0 will copy the whole source,
    // whatever its length.
    // The condition is written explicitly, deliberately (a simple
    // while(size) might be overlooked or mistaken for a bug).
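    while((size > 0)||(size < 0)) {
        // When a size was given, never read more than what is left of it;
        // otherwise size would go negative and the whole file would be copied.
        size_t want = BUFFER_SIZE;
        if ((size > 0) && (size < BUFFER_SIZE)) {
            want = (size_t)size;
        }
        // ssize_t matches read()'s return type; -1 signals an error.
        ssize_t ch_r = read(fd_from, buffer, want);
        if (ch_r == 0) {
            // finished
            break;
        }
        if ((ch_r < 0) || (ch_r != write(fd_to, buffer, ch_r))) {
            // Read error, or failed/short write (out of storage space?)
            close(fd_from);
            close(fd_to);
            free(buffer);
            unlink(destination);
            return 0;
        }
        written += ch_r;
        // We do have a problem of integer size. if
        // sizeof(int) is 4 (32bit), files or sizes larger than 2 GB will
        // likely misbehave.
        if (size > 0) {
            size -= ch_r;
        }
    }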
    close(fd_to);
    close(fd_from);
    free(buffer);
    return written;
}
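
The loop above treats any short write as a fatal error. On POSIX systems, write() may write fewer bytes than requested, or be interrupted by a signal, without that being a real failure, so one possible refinement is to retry until the whole chunk has been written. A minimal sketch, assuming a POSIX system; the helper name write_all is hypothetical and not part of the code above:

```c
#include <errno.h>
#include <unistd.h>

// Hypothetical helper: keeps calling write() until all `len` bytes
// have been written. Returns 0 on success, -1 on error (errno is
// left as set by write()).
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR) {
                continue;   // interrupted by a signal: retry
            }
            return -1;      // real error, e.g. ENOSPC (disk full)
        }
        buf += n;           // skip the part that was written
        len -= (size_t)n;
    }
    return 0;
}
```

With such a helper, the test `ch_r != write(fd_to, buffer, ch_r)` in the loop could become `write_all(fd_to, buffer, (size_t)ch_r) != 0`.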

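Also, you might find it useful to return an error status rather than the size. With that convention, returning zero would mean that the number of bytes written equals the requested size. If you need to return both values, you can put the error in a variable passed by pointer: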

ssize_t copy_file(char* source, char* destination, int size, int *status)
{
    *status = 0; // Begin with "no error"

    ...
    if (NULL == buffer) {
        *status = -8; // -8 stands for "out of memory"
        return 0;
    }

    ...

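This way, if something goes wrong, you will know why the routine returned zero. It also lets you create zero-length files when needed (the function will return 0, but the status will also be 0, indicating that writing zero bytes was not an error).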
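
To make the status idea concrete, here is a minimal complete sketch. Several things in it are my assumptions rather than part of the answer: the name copy_file_status, storing errno values in *status (the fragment above uses its own codes such as -8), the 4096-byte stack buffer, the 0644 mode, and reporting a short write as ENOSPC.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

// Sketch: copies source to destination and returns the bytes written.
// *status receives 0 on success or an errno value on failure
// (an assumption; the answer uses its own codes such as -8).
ssize_t copy_file_status(const char *source, const char *destination,
                         int *status)
{
    char buffer[4096];              // small fixed buffer, size is arbitrary
    ssize_t written = 0;
    ssize_t n;

    *status = 0;                    // begin with "no error"

    int fd_from = open(source, O_RDONLY);
    if (fd_from < 0) {
        *status = errno;
        return 0;
    }
    int fd_to = open(destination, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd_to < 0) {
        *status = errno;
        close(fd_from);
        return 0;
    }

    while ((n = read(fd_from, buffer, sizeof buffer)) > 0) {
        ssize_t w = write(fd_to, buffer, (size_t)n);
        if (w != n) {
            // write() sets errno only when it returns -1; a short
            // write is reported as ENOSPC here (an assumption).
            *status = (w < 0) ? errno : ENOSPC;
            break;
        }
        written += n;
    }
    if (n < 0) {
        *status = errno;            // read() failed
    }

    close(fd_from);
    close(fd_to);
    if (*status != 0) {
        unlink(destination);        // do not leave a partial copy behind
        return 0;
    }
    return written;
}
```

A caller can then tell an empty source file (return value 0, status 0) apart from a failure (return value 0, status non-zero).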

To copy a regular file without having to specify its size:

// Copies a file from source to destination
// and returns number of bytes written.

ssize_t copy_file(char* source, char* destination)
{
    if ((source == NULL) || (destination == NULL) || (access(source, F_OK) == -1)) {
        return 0;
    }

    #define BUFFER_SIZE 1048576
    char* buffer = malloc(BUFFER_SIZE);
    if (NULL == buffer) {
        return 0;
    }

    int fd_from = open(source, O_RDONLY);
    if (fd_from < 0) {
        free(buffer);
        return 0;
    }
    int fd_to = open(destination, O_WRONLY | O_CREAT | O_TRUNC, 0777);
    if (fd_to < 0) {
        free(buffer);
        // Avoid leaking a file handle in case of error.
        close(fd_from);
        return 0;
    }

    ssize_t written = 0;

    // Infinite loop, exiting when nothing more can be read
    for(;;) {
        // ssize_t matches read()'s return type; -1 signals an error.
        ssize_t ch_r = read(fd_from, buffer, BUFFER_SIZE);
        if (ch_r == 0) {
            // finished
            break;
        }
        if ((ch_r < 0) || (ch_r != write(fd_to, buffer, ch_r))) {
            // Read error, or failed/short write (out of storage space?)
            close(fd_from);
            close(fd_to);
            free(buffer);
            unlink(destination);
            return 0;
        }
        written += ch_r;
    }
    close(fd_to);
    close(fd_from);
    free(buffer);
    return written;
}
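
On the earlier remark about int being too small for large files: the check below is only an illustration, and the name fits_in_int is hypothetical. It assumes the common case of a 32-bit int, where INT_MAX is 2147483647, one less than 2 GB.

```c
#include <limits.h>

// Hypothetical check: can a byte count be stored in an int
// without overflowing?
int fits_in_int(long long bytes)
{
    return bytes >= 0 && bytes <= INT_MAX;
}
```

On a platform with 32-bit int, fits_in_int(2147483647LL) is true while fits_in_int(2147483648LL) (exactly 2 GB) is false, which is why a wider type such as off_t or size_t is the safer choice for file sizes.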

Answer 2

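sizeof(size) returns the size of the data type of the variable size, which for an int is usually 4, so the buffer always holds only 4 bytes. Use malloc(size) instead. Also, you read and write just one buffer's worth; if the file is larger than the buffer, you need a loop to repeat the process.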
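
The difference can be seen in isolation with two tiny functions. Both names are hypothetical and exist only for this illustration:

```c
#include <stddef.h>

// Bytes that malloc(sizeof(size)) requests: the size of the int
// type, no matter what value `size` holds.
size_t buggy_alloc_size(int size)
{
    return sizeof(size);
}

// Bytes that malloc(size) requests: the value itself.
size_t fixed_alloc_size(int size)
{
    return (size_t)size;
}
```

For a requested size of 1048576, buggy_alloc_size returns sizeof(int) (commonly 4), while fixed_alloc_size returns 1048576.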

Also, use || instead of | for the logical OR in if (fd_to < 0 | fd_from < 0).
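
A small sketch of what changes between the two operators. The helper names are hypothetical, and a call counter is used only to make evaluation visible:

```c
// Counts how often is_negative() runs, so evaluation can be observed.
static int calls = 0;

static int is_negative(int fd)
{
    calls++;
    return fd < 0;
}

// Bitwise OR: both operands are always evaluated.
int bitwise_or_calls(int a, int b)
{
    calls = 0;
    (void)(is_negative(a) | is_negative(b));
    return calls;
}

// Logical OR: the right operand is skipped once the left one is true.
int logical_or_calls(int a, int b)
{
    calls = 0;
    (void)(is_negative(a) || is_negative(b));
    return calls;
}
```

With arguments (-1, 3), bitwise_or_calls returns 2 and logical_or_calls returns 1. In the question's condition both forms happen to give the same truth value, because < binds more tightly than | and each comparison yields 0 or 1, but || states the intent and short-circuits.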

Answer 3

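I assume you need to copy files of arbitrary length, and that the size parameter passed to the function is the buffer size, not the file size. By the way, you need to call malloc(size), not malloc(sizeof(size)). Most importantly, you need a loop around read() and write(), something like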

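ssize_t rd_len, wr_len;   /* signed, so the -1 error return is visible */
do {
    rd_len = read(fd_from, buffer, size);
    if (rd_len <= 0)
        break;            /* 0 = end of file, -1 = read error */
    wr_len = write(fd_to, buffer, rd_len);   /* write only what was read */
    if (wr_len != rd_len)
        break;            /* write error or short write */
    written += wr_len;
} while (wr_len > 0);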

Answer 4

This isn't the reason it doesn't work (assuming it is correct), but:

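Don't try to read the whole file into memory at once. Allocate a fixed-size buffer (say, 1000 bytes) and loop, reading a block and writing a block, until you reach the end of the file.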
