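We need a script that compares two directories of files and, for every file that has changed between directory 1 and directory 2 (added, deleted, modified), creates a subset containing only those changed files.
My first impression is to write a Python script that walks each directory, computes a hash of every file, and copies a file into the new subset if its hash has changed. Is that a sound approach? Am I overlooking tools that already do this? I have never used it, but could something like rsync do the job?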
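Thanks
Edit:
The important part is that I can build a subset containing only the files that changed, so if only 3 files changed between versions, I only need to copy those three files to a new directory...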
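The hash-and-copy idea described in the question can be sketched roughly as follows. This is only a minimal sketch under some assumptions: the function names (`file_hash`, `relative_files`, `copy_changed_subset`) are made up for illustration, SHA-256 is an arbitrary choice of hash, and deleted files are only reported because there is nothing left to copy.

```python
import hashlib
import os
import shutil


def file_hash(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def copy_changed_subset(old_dir, new_dir, subset_dir):
    """Copy files added or modified in new_dir into subset_dir.

    Returns (added, modified, deleted) as sorted lists of relative paths.
    Deleted files cannot be copied, so they are only reported.
    """
    old_files = relative_files(old_dir)
    new_files = relative_files(new_dir)

    added = sorted(new_files - old_files)
    deleted = sorted(old_files - new_files)
    # A file present on both sides counts as modified when its hash differs.
    modified = sorted(
        rel for rel in old_files & new_files
        if file_hash(os.path.join(old_dir, rel)) != file_hash(os.path.join(new_dir, rel))
    )

    for rel in added + modified:
        target = os.path.join(subset_dir, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(os.path.join(new_dir, rel), target)  # keeps timestamps

    return added, modified, deleted
```

Calling `copy_changed_subset('v1', 'v2', 'changed')` would leave only the new and modified files in `changed`, keeping their relative layout. The subset directory should live outside both compared trees.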
Answer 0 (score: 3)
It seems to me that you need something simple like this:
from os import listdir
from os.path import getmtime, join

rep1 = 'I:\\dada'
rep2 = 'I:\\didi'

# Only the top level of each directory is listed (no recursion)
R1 = listdir(rep1)
R2 = listdir(rep2)

# In rep1 but not in rep2, and the other way round
vanished = [filename for filename in R1 if filename not in R2]
appeared = [filename for filename in R2 if filename not in R1]

# In both directories, but with different modification times
modified = [filename for filename in (f for f in R2 if f in R1)
            if getmtime(join(rep1, filename)) != getmtime(join(rep2, filename))]

print('vanished==', vanished)
print('appeared==', appeared)
print('modified==', modified)
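For nested directories, the same three categories can also be obtained from the standard library's `filecmp.dircmp`. The following is a rough sketch, not part of the answer above; `compare_trees` is a hypothetical helper name. Note that `dircmp` compares files shallowly by default (files with the same type, size and modification time are treated as equal without reading them), and that `left_only` / `right_only` may contain directory names as well as file names.

```python
import filecmp
import os


def compare_trees(left, right, prefix=""):
    """Recursively collect (vanished, appeared, modified) relative paths."""
    result = filecmp.dircmp(left, right)

    # Entries found on one side only (these can be files or directories)
    vanished = [os.path.join(prefix, name) for name in result.left_only]
    appeared = [os.path.join(prefix, name) for name in result.right_only]
    # Files present on both sides that dircmp considers different
    modified = [os.path.join(prefix, name) for name in result.diff_files]

    # Descend into directories that exist on both sides
    for name in result.common_dirs:
        sub_vanished, sub_appeared, sub_modified = compare_trees(
            os.path.join(left, name),
            os.path.join(right, name),
            os.path.join(prefix, name),
        )
        vanished += sub_vanished
        appeared += sub_appeared
        modified += sub_modified

    return sorted(vanished), sorted(appeared), sorted(modified)
```

The returned paths are relative to the two roots, so they can be fed straight into a copy step.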
Answer 1 (score: 2)
That is a perfectly reasonable approach, but you are essentially reinventing rsync. So yes, use rsync.
Edit: There's a way to create "difference-only" folders using rsync
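The linked page is not reproduced here, so which rsync invocation it recommends is an assumption. One way that fits the description is rsync's `--compare-dest` option, which skips files that are already identical in a reference directory. A small sketch that drives it from Python follows; the helper names `rsync_diff_command` and `run_rsync_diff` are hypothetical, and rsync has to be installed for the second one to work.

```python
import os
import subprocess


def rsync_diff_command(old_dir, new_dir, diff_dir):
    """Build an rsync command that copies to diff_dir only what differs from old_dir."""
    # --compare-dest is resolved relative to the destination when it is not
    # absolute, so an absolute path is used here to avoid surprises.
    old_abs = os.path.abspath(old_dir).rstrip("/") + "/"
    # Trailing slashes make rsync copy the directory contents, not the directory itself.
    return [
        "rsync",
        "-a",
        "--compare-dest=" + old_abs,
        new_dir.rstrip("/") + "/",
        diff_dir.rstrip("/") + "/",
    ]


def run_rsync_diff(old_dir, new_dir, diff_dir):
    """Run the command; raises CalledProcessError if rsync fails."""
    subprocess.run(rsync_diff_command(old_dir, new_dir, diff_dir), check=True)
```

By default rsync decides whether a file is unchanged from its size and modification time; adding `--checksum` to the list would compare contents instead. Files deleted in the newer directory do not show up in the result.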
Answer 2 (score: 0)
I like diffmerge; it works well for this purpose.
Answer 3 (score: 0)
I have modified @eyquem's answer a bit!
Arguments can be given as: python file.py dir1 dir2
Note: files are classified based on modification time!
#!/usr/bin/env python3
import sys
from os import listdir
from os.path import getmtime, join

ORIG_DIR = sys.argv[1]      # orig:-->/root/backup.FPSS/bin/httpd
MODIFIED_DIR = sys.argv[2]  # modified-->/FPSS/httpd/bin/httpd

LIST_OF_FILES_IN_ORIG_DIR = listdir(ORIG_DIR)
LIST_OF_FILES_IN_MODIFIED_DIR = listdir(MODIFIED_DIR)

vanished = [f for f in LIST_OF_FILES_IN_ORIG_DIR if f not in LIST_OF_FILES_IN_MODIFIED_DIR]
appeared = [f for f in LIST_OF_FILES_IN_MODIFIED_DIR if f not in LIST_OF_FILES_IN_ORIG_DIR]

# Files present in both directories, split by which copy is newer
common = [f for f in LIST_OF_FILES_IN_MODIFIED_DIR if f in LIST_OF_FILES_IN_ORIG_DIR]
modified = [f for f in common if getmtime(join(ORIG_DIR, f)) < getmtime(join(MODIFIED_DIR, f))]
same = [f for f in common if getmtime(join(ORIG_DIR, f)) >= getmtime(join(MODIFIED_DIR, f))]

def print_list(arg):
    # Prints the entries itself and returns nothing, so it is called on its own line
    for f in arg:
        print('----->', f)
    print('Total :: ', len(arg))

print('#' * 100)
print('Files which have Vanished from MOD: ', MODIFIED_DIR, ' but still present ', ORIG_DIR, ' ==>')
print_list(vanished)
print('-' * 100)
print('Files which are Appearing in MOD: ', MODIFIED_DIR, ' but not present ', ORIG_DIR, ' ==>')
print_list(appeared)
print('-' * 100)
print('Files in MOD: ', MODIFIED_DIR, ' which are MODIFIED if compared to ORIG: ', ORIG_DIR, ' ==>')
print_list(modified)
print('-' * 100)
print('Files in MOD: ', MODIFIED_DIR, ' which are NOT modified if compared to ORIG: ', ORIG_DIR, ' ==>')
print_list(same)
print('#' * 100)