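Every minute I need to copy recorded files from 3 servers to one data storage machine. I don't need to keep the originals, since no data processing happens on those servers.
But when I use the option --remove-sent-files, rsync sends and removes unfinished (not yet closed) files.
I tried to keep these open files from being sent by using lsof and --exclude-from, but it seems rsync does not recognize full paths in the exclude list: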
--exclude-from=FILE read exclude >>patterns<< from FILE
lsof | grep /projects/recordings/.\\+\\.\\S\\+ -o | sort | uniq
/projects/recordings/<uid>/<path>/2012-07-16 13:24:32.646970-<id>.WAV
So the script looks like this:
# get open files in src dir and put them into rsync.exclude file
lsof | grep /projects/recordings/.\\+\\.\\S\\+ -o | sort | uniq > /tmp/rsync.exclude
# sync without these files
/usr/bin/rsync -raz --progress --size-only --remove-sent-files --exclude-from=/tmp/rsync.exclude /projects/recordings/ site.com:/var/www/storage/recordings/
# change owner
ssh storage@site.com chown -hR storage:storage /var/www/storage/recordings
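So, is there another tool I could try? Or why doesn't rsync honor the exclude list?
Answer 0: (score: 5)
I'm not sure whether this helps you, but here is my solution for rsyncing only files that are not currently being written to. I use it for tshark captures, which write a new file every N seconds using the -a flag (e.g. tshark -i eth0 -a duration:30 -w /foo/bar/caps). Watch out for the tricky rsync part: the order of the include and exclude options matters, and if we want subdirectories we need to include "*/".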
save_path=/foo/bar/
delay_between_syncs=30
while true
do
    sleep "$delay_between_syncs"
    # Work out which files are currently open (i.e. the ones still being written to)
    # and avoid uploading them. This ensures that the files we process on the server
    # are complete.
    : > /tmp/include_list.txt
    find "$save_path" -type f | while IFS= read -r i
    do
        # fuser prints the PIDs that have the file open; empty output means nobody does
        op=$(fuser "$i" 2>/dev/null)
        if [ -z "$op" ]
        then
            #echo "[+] $i is good for upload, adding it to the list."
            # Keep only the file name, so the pattern matches in any subdirectory
            c=$(basename "$i")
            echo "$c" >> /tmp/include_list.txt
        fi
    done
    echo "[+] Syncing..."
    rsync -rzt --include-from=/tmp/include_list.txt --include="*/" --exclude="*" "$save_path" user@server:/home/backup/foo/
    echo "[+] Synced."
done
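The script above keeps only the file name of each path, which sidesteps the way rsync anchors patterns. If full paths are preferred, as in the question's exclude list, note that rsync treats a pattern starting with / as anchored at the root of the transfer, not at the filesystem root, so the source directory prefix has to be stripped first. A minimal sketch, assuming the source directory is /projects/recordings as in the question; the helper name to_rsync_patterns is made up for illustration:

```shell
#!/bin/sh
# to_rsync_patterns: hypothetical helper (not part of rsync).
# Reads absolute paths, one per line, on stdin and prints them as
# patterns anchored at the transfer root.
# $1 is the source directory given to rsync, without a trailing slash.
to_rsync_patterns() {
    src_root=$1
    while IFS= read -r path
    do
        case $path in
            # Only paths below the source directory are relevant
            "$src_root"/*) printf '/%s\n' "${path#"$src_root"/}" ;;
        esac
    done
}

# Usage sketch, reusing the lsof pipeline from the question:
#   lsof | grep /projects/recordings/.\\+\\.\\S\\+ -o | sort | uniq \
#       | to_rsync_patterns /projects/recordings > /tmp/rsync.exclude
```

With this, /projects/recordings/<uid>/<path>/file.WAV becomes /<uid>/<path>/file.WAV, which is how rsync sees the file when the source is /projects/recordings/. File names containing wildcard characters such as * or [ would still need escaping, which this sketch does not attempt.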
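Answer 1: (score: 0)
Rsync the files first, then delete the ones that were actually transferred: capture the list of transferred files and remove only those that are not currently open. Rsync determines which files to transfer when it reaches a directory, so even if your solution works at first, it is bound to fail whenever a file that was opened after rsync started is missing from the exclude list.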
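The transfer-then-delete idea could be sketched roughly as follows. This assumes fuser (from psmisc) is installed and that rsync is run with --out-format='%n' so that it prints one transferred name per line; the helper names is_open and remove_transferred are made up for illustration.

```shell
#!/bin/sh
# is_open: succeeds if some process currently has the file open.
# Assumption: fuser is available and may inspect the writing processes.
is_open() {
    fuser "$1" >/dev/null 2>&1
}

# remove_transferred: hypothetical helper. Reads names relative to $1
# (one per line, as printed by rsync --out-format='%n') and deletes the
# regular files that no process has open any more.
remove_transferred() {
    src_root=$1
    while IFS= read -r name
    do
        file=$src_root/$name
        [ -f "$file" ] || continue     # skip directories and vanished files
        is_open "$file" && continue    # still being written: keep it
        rm -- "$file"
    done
}

# Usage sketch (paths taken from the question):
#   rsync -az --out-format='%n' /projects/recordings/ \
#       site.com:/var/www/storage/recordings/ > /tmp/transferred.txt
#   remove_transferred /projects/recordings < /tmp/transferred.txt
```

A file that was copied while half-written and then closed before the delete pass would still be removed, so this is only safe if such files are also kept out of the transfer itself or are verified on the destination.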
Another approach is to do a
find dir -type f -name pattern -mmin +10 | xargs -I {} rsync -aP {} dest:/path/to/backups
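The pipeline above starts one rsync per file. A variant that hands the whole batch to a single rsync run might look like the sketch below. It assumes a find that supports -mmin and -print0 (GNU and BSD find do) and uses rsync's --files-from and --from0 options; the helper name list_idle_files is made up for illustration.

```shell
#!/bin/sh
# list_idle_files: hypothetical helper. Prints, NUL-separated and relative
# to directory $1, the regular files matching name pattern $2 that were
# last modified more than $3 minutes ago.
list_idle_files() {
    ( cd "$1" && find . -type f -name "$2" -mmin +"$3" -print0 )
}

# Usage sketch: "dir", "pattern" and dest:/path/to/backups are the
# placeholders from the answer.
#   list_idle_files dir 'pattern' 10 \
#       | rsync -aP --from0 --files-from=- dir/ dest:/path/to/backups/
```

The 10-minute threshold only works if a recording that is still in progress gets written to at least that often; a file left open but idle for longer would be picked up as well.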