Question

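OK, the problem here is simple. I'm writing a simple backup script. It works fine except when a file name contains spaces. This is how I find the files and add them to a tar archive: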

find . -type f | xargs tar -czvf backup.tar.gz

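The problem shows up when a file has a space in its name, because tar then thinks it is a folder. Basically, is there a way to add quotes around the results from find? Or is there a different way to fix this?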

Answer 1

Use this:

find . -type f -print0 | tar -czvf backup.tar.gz --null -T -

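It will:

deal with files whose names contain spaces, newlines, leading dashes, and other funniness
handle an unlimited number of files
not repeatedly overwrite your backup.tar.gz, which is what happens with tar -c under xargs when you have a large number of files

Also see:

GNU tar manual
How can I build a tar from stdin?, search for null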
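A minimal sketch of how this can be checked, assuming GNU find and GNU tar are installed. The temporary directory and file names are made up for illustration, and the archive is written outside the scanned tree so that find does not pick it up:

```shell
#!/bin/sh
# Sketch only: throwaway paths, GNU find and GNU tar assumed.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/sub dir"
echo one > "$workdir/src/plain.txt"
echo two > "$workdir/src/sub dir/name with spaces.txt"

# NUL-terminated names go straight from find to tar; no xargs involved.
(cd "$workdir/src" && find . -type f -print0 | tar -czf "$workdir/backup.tar.gz" --null -T -)

# Count the entries that ended up in the archive.
archived=$(tar -tzf "$workdir/backup.tar.gz" | wc -l | tr -d ' ')
echo "archived files: $archived"
```

Because tar reads the whole list from stdin in a single run, the archive is created exactly once regardless of how many names find emits.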

Answer 2

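There could be another way to achieve what you want. Basically:

Use the find command to output the paths of whatever you're looking for. Redirect stdout to a file name of your choosing.
Then run tar with the -T option, which lets it take a list of file locations (the one you just created with find!)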
```
find . -name "*.whatever" > yourListOfFiles
tar -cvf yourfile.tar -T yourListOfFiles
```
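As a rough illustration of the two steps above, here is the same idea run against throwaway files whose names contain spaces (GNU tar assumed; the directory and file names are hypothetical). A newline-separated list copes with spaces, but not with names that themselves contain newlines:

```shell
#!/bin/sh
# Sketch only: hypothetical file names in a temporary directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/src"
echo a > "$workdir/src/report one.whatever"
echo b > "$workdir/src/report two.whatever"

list="$workdir/yourListOfFiles"
# Step 1: write one path per line. Step 2: let tar read that list with -T.
(cd "$workdir/src" && find . -name "*.whatever" > "$list" && tar -cf "$workdir/yourfile.tar" -T "$list")

listed=$(tar -tf "$workdir/yourfile.tar" | wc -l | tr -d ' ')
echo "entries in archive: $listed"
```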

Answer 3

Try running:

    find . -type f | xargs -d "\n" tar -czvf backup.tar.gz

Answer 4

Why not:

tar czvf backup.tar.gz *

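Sure, it's clever to use find and then xargs, but you're doing it the hard way.

Update: Porges has commented with a find option that I think is a better answer than mine, or the other one: find -print0 ... | xargs -0 ....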

Answer 5

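If you have multiple files or directories and you want to compress each of them into its own *.gz file, you can do this. The -type f and -mtime tests are optional.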

find -name "httpd-log*.txt" -type f -mtime +1 -exec tar -vzcf {}.gz {} \;

This will compress

httpd-log01.txt
httpd-log02.txt

to

httpd-log01.txt.gz
httpd-log02.txt.gz
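A small sketch of the per-file approach, assuming GNU find (which substitutes {} even inside a larger argument such as {}.gz) and GNU tar. The -mtime test is dropped here so that freshly created sample files match. Note that each result is a gzip-compressed tar archive, even though the name ends in plain .gz:

```shell
#!/bin/sh
# Sketch only: sample log files in a temporary directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "log 1" > httpd-log01.txt
echo "log 2" > httpd-log02.txt

# One tar invocation per matching file; {} is replaced by the file's path.
find . -name "httpd-log*.txt" -type f -exec tar -zcf {}.gz {} \;

created=$(ls httpd-log*.txt.gz | wc -l | tr -d ' ')
first_member=$(tar -tzf httpd-log01.txt.gz)
echo "archives created: $created"
echo "member of first archive: $first_member"
```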

Answer 6

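Why not give something like this a try: tar cvf scala.tar `find src -name '*.scala'` (the pattern is quoted so the shell does not expand it first; names containing spaces will still be split into separate words)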

Answer 7

Also see the solution here:

find var/log/ -iname "anaconda.*" -exec tar -cvzf file.tar.gz {} +

Answer 8

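The best solution seems to be to create a file list first and then archive the files, because you can pull in other sources and do other things with the list.

For example, this lets you use the list to calculate the size of the files being archived: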

#!/bin/sh

backupFileName="backup-big-$(date +"%Y%m%d-%H%M")"
backupRoot="/var/www"
backupOutPath=""

archivePath="$backupOutPath$backupFileName.tar.gz"
listOfFilesPath="$backupOutPath$backupFileName.filelist"

#
# Make a list of files/directories to archive
#
: > "$listOfFilesPath"    # start with an empty list (no blank first line)
echo "${backupRoot}/uploads" >> "$listOfFilesPath"
echo "${backupRoot}/extra/user/data" >> "$listOfFilesPath"
find "${backupRoot}/drupal_root/sites/" -name "files" -type d >> "$listOfFilesPath"

#
# Size calculation (du -b is a GNU extension)
#
sizeForProgress=$(
    while IFS= read -r nextFile; do
        if [ -n "$nextFile" ]; then
            du -sb "$nextFile"
        fi
    done < "$listOfFilesPath" | awk '{size+=$1} END {print size}'
)

#
# Archive with progress
#
## simple with dump of all files currently archived
#tar -czvf "$archivePath" -T "$listOfFilesPath"
## progress bar (printf is used because echo -e is not portable under /bin/sh)
sizeForShow=$(($sizeForProgress/1024/1024))
printf '\nRunning backup [source files are %s MiB]\n\n' "$sizeForShow"
tar -cPp -T "$listOfFilesPath" | pv -s "$sizeForProgress" | gzip > "$archivePath"

Answer 9

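I would add a comment to @Steve Kehlet's post, but that needs 50 rep (RIP).

For anyone who has found this post after numerous searches: I found a way to not only find specific files within a given time range, but also to leave out the relative paths and the whitespace that would otherwise cause errors when tarring. (Thank you so much.)

find . -name "*.pdf" -type f -mtime 0 -printf "%f\0" | tar -czvf /dir/zip.tar.gz --null -T -

.: the relative (current) directory
-name "*.pdf": look for PDFs (or any file type)
-type f: the type to look for is a regular file
-mtime 0: look for files modified within the last 24 hours
-printf "%f\0": a regular -print0 or -printf "%f" did not work for me. From the man page:

This quoting is performed in the same way as for GNU ls. This is not the same quoting mechanism as the one used for -ls and -fls. If you are able to decide what format to use for the output of find then it is normally better to use '\0' as a terminator than to use newline, as file names can contain white space and newline characters.

-czvf: create an archive, filter the archive through gzip, verbosely list the files processed, and take the archive name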
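To make the effect of %f concrete, here is a small sketch comparing it with %p (GNU find assumed; the file names are invented for illustration). Since %f keeps only the last path component, a tar started in the top directory would only find files that sit directly in that directory; a nested file such as docs/nested.pdf below would be listed by name but not found:

```shell
#!/bin/sh
# Sketch only: invented PDF names in a temporary directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/docs"
touch "$workdir/top level.pdf" "$workdir/docs/nested.pdf"
cd "$workdir" || exit 1

# %p prints the path as find sees it; %f prints only the last component.
with_path=$(find . -name '*.pdf' -type f -printf '%p\n' | LC_ALL=C sort)
base_only=$(find . -name '*.pdf' -type f -printf '%f\n' | LC_ALL=C sort)

printf '%s\n' "paths:" "$with_path" "basenames:" "$base_only"
```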

Answer 10

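A warning about several of the solutions (and your own test):

When you do: anything | xargs something

xargs will try to fit "as many arguments as possible" after "something", but you may then end up with multiple invocations of "something".

So your attempt, find ... | xargs tar czvf file.tgz, may end up overwriting "file.tgz" at each invocation of "tar" by xargs, and you are left with only the last invocation's files! (The chosen solution uses the GNU-specific -T parameter to avoid this problem, but not everyone has GNU tar available.)

You could do this instead: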

find . -type f -print0 | xargs -0 tar -rvf backup.tar
gzip backup.tar

Proof of the problem on cygwin:

$ mkdir test
$ cd test
$ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs touch 
    # create the files
$ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs tar czvf archive.tgz
    # will invoke tar several times as it can't fit 10000 long filenames into 1
$ tar tzvf archive.tgz | wc -l
60
    # in my own machine, I end up with only the 60 last filenames, 
    # as the last invocation of tar by xargs overwrote the previous one(s)

# proper way to invoke tar: with -r (which appends to an existing tar file, whereas c would overwrite it)
# caveat: you can't have it compressed (you can't add to a compressed archive)
$ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs tar rvf archive.tar #-r, and without z
$ gzip archive.tar
$ tar tzvf archive.tar.gz | wc -l
10000 
  # we have all our files, despite xargs making several invocations of the tar command

Note: this behaviour of xargs is well known, and it is also why, when someone wants to do:

find .... | xargs grep "regex"

they need to write it like this instead:

find ..... | xargs grep "regex" /dev/null

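That way, even if the last invocation of grep by xargs appends only one file name, grep still sees at least two file names (each time it gets /dev/null, where it won't find anything, followed by the file name(s) appended by xargs), and so it always displays the file name when something matches "regex". Otherwise you may end up with the last results showing matches without a file name in front.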
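The difference is easy to see with a single sample file; a minimal sketch, using a throwaway file name:

```shell
#!/bin/sh
# Sketch only: one sample file in a temporary directory.
workdir=$(mktemp -d)
echo "needle here" > "$workdir/only.txt"

# With a single file operand, grep prints just the matching line.
without=$(grep "needle" "$workdir/only.txt")
# With /dev/null as a second operand, grep prefixes each match with its file name.
with=$(grep "needle" "$workdir/only.txt" /dev/null)

echo "$without"
echo "$with"
```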

Find files and tar them (with spaces)
