rsync with the --remove-sent-files option and open files

Date: 2012-07-16 09:50:54

Tags: linux file rsync

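Every minute I need to copy recorded files from 3 servers to one data storage server. I don't need to keep the original files; the data is processed elsewhere, not on any of these servers.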

But when I use the --remove-sent-files option, rsync sends and removes files that are not finished yet (not closed).

I tried to stop these open files from being sent by combining lsof with --exclude-from, but it seems rsync does not understand full paths in the exclude list:

--exclude-from=FILE     read exclude >>patterns<< from FILE

lsof | grep /projects/recordings/.\\+\\.\\S\\+ -o | sort | uniq
/projects/recordings/<uid>/<path>/2012-07-16 13:24:32.646970-<id>.WAV

So the script looks like this:

# get open files in src dir and put them into rsync.exclude file
lsof | grep /projects/recordings/.\\+\\.\\S\\+ -o | sort | uniq > /tmp/rsync.exclude
# sync without these files
/usr/bin/rsync -raz --progress --size-only --remove-sent-files --exclude-from=/tmp/rsync.exclude /projects/recordings/ site.com:/var/www/storage/recordings/
# change owner
ssh storage@site.com chown -hR storage:storage /var/www/storage/recordings

So, should I try some other tool? Or why does rsync ignore the exclude list?

2 answers:

Answer 0 (score: 5)

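I'm not sure whether this will help you, but here is my solution for rsyncing only files that are not currently being written. I use it for tshark captures, where the -a flag makes tshark write a new file every N seconds (e.g. tshark -i eth0 -a duration:30 -w /foo/bar/caps). Note the tricky part of the rsync call: the order of the include and exclude options matters, and if we want subdirectories we need to include "*/".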

-G

save_path=/foo/bar/
delay_between_syncs=30
while true
do
    sleep "$delay_between_syncs"

    # Work out which files are currently open (i.e. the ones still being written to)
    # and avoid uploading them. This ensures that when we process files on the
    # server, they are complete.
    : > /tmp/include_list.txt
    find "$save_path" -type f | while IFS= read -r i
    do
        # fuser prints the PIDs that hold the file open; empty output means none
        op=$(fuser "$i" 2>/dev/null)
        if [ -z "$op" ]
        then
            # The file is closed, so add its base name to the include list
            basename "$i" >> /tmp/include_list.txt
        fi
    done

    echo "[+] Syncing..."
    # Order matters: listed files first, then all directories, then exclude the rest
    rsync -rzt --include-from=/tmp/include_list.txt --include="*/" --exclude="*" "$save_path" user@server:/home/backup/foo/
    echo "[+] Sunk..."
done
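
On why the exclude list in the question had no effect: rsync matches include/exclude patterns against paths relative to the transfer root, and a leading "/" anchors a pattern at that root, not at the filesystem root. An absolute path such as /projects/recordings/<uid>/... therefore never matches. Below is a minimal sketch of a filter that rewrites absolute paths into root-anchored patterns; the function name to_rsync_patterns is hypothetical, and it assumes file names contain no rsync wildcard characters (*, ?, [).

```shell
# Hypothetical helper: read absolute paths (one per line) on stdin and print
# rsync patterns anchored at the transfer root given as $1.
to_rsync_patterns() {
    local root="${1%/}"   # drop a trailing slash, e.g. /projects/recordings
    local path
    while IFS= read -r path; do
        case "$path" in
            # Keep only the part below the root, with a leading "/" so the
            # pattern is anchored at the root of the transfer.
            "$root"/*) printf '%s\n' "/${path#"$root"/}" ;;
            # Paths outside the root are silently ignored.
        esac
    done
}
```

The list of open files from the question could be piped through to_rsync_patterns /projects/recordings before being written to the exclude file. This only narrows the window: a file opened after the list was built is still not excluded.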

Answer 1 (score: 0)

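Rsync the files without deleting them, capture the list of transferred files, and then delete only those transferred files that are not currently open. Rsync decides which files to transfer when it reaches each directory, so even if your solution worked at first, it is bound to fail as soon as a file opened after rsync started is missing from the exclude list.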
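
A minimal sketch of that transfer-then-delete idea follows. The function names is_open and remove_if_closed are hypothetical, and it assumes fuser from psmisc is available (it exits with status 0 when some process has the file open).

```shell
# Hypothetical helper: succeed if some process has the file open.
# If fuser is not installed, report "open" so that nothing gets deleted.
is_open() {
    command -v fuser >/dev/null 2>&1 || return 0
    fuser -s "$1" 2>/dev/null
}

# Hypothetical helper: read paths relative to the source directory $1 from
# stdin and delete each regular file that no process currently has open.
remove_if_closed() {
    local src="${1%/}"
    local rel f
    while IFS= read -r rel; do
        f="$src/$rel"
        [ -f "$f" ] || continue      # skip directories and missing files
        is_open "$f" && continue     # still being written: keep it
        rm -f -- "$f"
    done
}
```

The list could come from rsync itself, for example by running rsync with --out-format='%n' (and without --progress or --remove-sent-files) and piping its output into remove_if_closed /projects/recordings. A file that was still growing during the transfer but closed before the check would be deleted with only a partial copy on the destination, so this reduces the risk rather than eliminating it.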

Another approach would be to do a

find dir -type f -name pattern -mmin +10 | xargs -i rsync -aP {} dest:/path/to/backups
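
The one-liner above starts one rsync per file and splits on whitespace, which matters here because the recording names contain spaces. A possible variant, sketched under the assumption that a file untouched for N minutes is finished, emits a NUL-delimited list that a single rsync can read; list_settled_files is a hypothetical name.

```shell
# Hypothetical helper: print the files under $1 that have not been modified
# for more than $2 minutes, as NUL-terminated paths relative to $1.
list_settled_files() {
    local root="$1" minutes="$2"
    # Run find from inside the root so the printed paths are relative to it.
    ( cd "$root" && find . -type f -mmin +"$minutes" -print0 )
}
```

The output could be fed to one rsync call via its --from0 and --files-from=- options, with the same directory as the source argument, for example: list_settled_files /projects/recordings 10 | rsync -a --from0 --files-from=- /projects/recordings/ dest:/path/to/backups/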