Batch: comparing directory-listing txt files

Time: 2017-07-20 11:42:02

Tags: windows batch-file

Windows 7/8/10 (x86 / x64)
Windows batch

Hi all,

First, my task:

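Back up a music library drive to a secondary drive every day using Robocopy. Because files may get deleted from the music library, the two drives cannot be mirror clones. Only different/new files are copied to the backup drive.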

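Occasionally a file turns out to be in the wrong folder of the music library and is moved to the correct one. The next time the backup script runs, it copies these "new" files and their "new" folders to the backup drive. The backup drive then holds two copies of the file: one in the correct "new" location in the tree and one in the old, incorrect location. The old one is an unneeded duplicate.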

So far I have managed to write the following:

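  1. Run through the latest Robocopy log file (RClog) and convert the source file paths into the corresponding paths on the backup drive, so that the two can be compared.
  2. DIR the backup drive and dump the listing to a text file.
  3. Put the converted RClog into an "array" of variables.
  4. Loop through each line of the DIR list and compare the filename against the array. If the name matches, check whether the full path matches too. If the full path does not match, the file is potentially* a duplicate, and its full path is written to a txt file.

So, my code as it stands is below. However, it is very slow. I guess that is because I'm performing close to 1 million comparisons: roughly 10k mp3 files checked against about 100 newly copied files in the RClog.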

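I think I need to start from the smaller file list, the RClog, and look those files up in the backup drive's DIR_List instead. However, I have noticed that FINDSTR doesn't like backslashes and needs them escaped. That would require a FOR loop doing variable substring substitution, which might slow things down across both lists.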
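One way to start from the smaller list without FINDSTR, and therefore without any backslash escaping, might be to key one variable on each newly copied file's name and then make a single pass over the big DIR list. This is only a minimal sketch, assuming both lists already exist as text files with one full path per line. The `New[...]` variable prefix and the `RC_LIST`/`DIR_LIST` names are made up for illustration; the two paths simply mirror the temp files used in the script.

```batch
@ECHO OFF
SETLOCAL EnableDelayedExpansion

:: Assumed inputs: one full path per line in each file.
SET "RC_LIST=%TMP%\Temp_RC_Conv.txt"
SET "DIR_LIST=%TMP%\Temp_Dir_list.txt"

:: Pass 1 (small list): remember each newly copied file under its name.ext.
FOR /F "usebackq delims=" %%a IN ("%RC_LIST%") DO SET "New[%%~nxa]=%%~fa"

:: Pass 2 (big list): one variable lookup per line instead of an inner loop.
:: Same name but a different full path means a possible duplicate.
FOR /F "usebackq delims=" %%a IN ("%DIR_LIST%") DO (
    IF NOT "!New[%%~nxa]!"=="" (
        IF /I NOT "!New[%%~nxa]!"=="%%~fa" ECHO Possible duplicate: %%~fa
    )
)

ENDLOCAL
```

Known limits of this sketch: with delayed expansion enabled, paths containing `!` get mangled; a file name containing `=` would produce the wrong variable name; and if two newly copied files share a name, only the last one is remembered.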

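I also have one final task to implement: MD5-hash each suggested unwanted duplicate and see whether it is identical to the newly copied file. *It may just happen to share the same name but come from a different artist, hence the check. Some human intervention may be needed at some point, but for the most part automation is key, and faster automation at that.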
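For the hash check, `certutil -hashfile` ships with Windows and should keep this within the no-third-party-tools constraint. The following is a minimal sketch, assuming the two files to compare are passed as arguments. The `:MD5` subroutine and the `HASH_A`, `HASH_B` and `MD5_TMP` names are hypothetical, and apart from an existence check there is no error handling.

```batch
@ECHO OFF
:: Usage (hypothetical): pass the suspected duplicate and the newly copied file.
IF NOT EXIST "%~1" (ECHO First file not found & EXIT /B 1)
IF NOT EXIST "%~2" (ECHO Second file not found & EXIT /B 1)
SETLOCAL

CALL :MD5 "%~1" HASH_A
CALL :MD5 "%~2" HASH_B

IF /I "%HASH_A%"=="%HASH_B%" (ECHO Same content) ELSE (ECHO Different content)
ENDLOCAL
GOTO :EOF

:MD5
:: %1 = file to hash, %2 = name of the variable that receives the hash.
:: certutil prints a header line, then the hash, then a status line,
:: so skip the first line and keep only the next one.
SET "MD5_TMP="
FOR /F "skip=1 delims=" %%h IN ('certutil -hashfile "%~1" MD5') DO (
    IF NOT DEFINED MD5_TMP SET "MD5_TMP=%%h"
)
:: Some certutil versions separate the bytes with spaces; remove them.
IF DEFINED MD5_TMP SET "MD5_TMP=%MD5_TMP: =%"
SET "%~2=%MD5_TMP%"
EXIT /B
```

Hashing only the files that the name comparison has already flagged keeps the number of `certutil` calls small, since each call starts a new process.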

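Although I would really prefer to keep this pure batch with no third-party tools, I would consider PowerShell, as long as it needs nothing newer than v3.0. I don't want to update things on the systems and force changes. I'm trying to work with what is already there, and pure batch is available.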

I apologise in advance for the rather casual commenting style.

Thanks in advance for any contributions.

    ::@ECHO OFF
    
    @ECHO:
    @ECHO: ===========================================================
    @ECHO:
    @ECHO:         Music_library backup duplicate files finder
    @ECHO:
    @ECHO: Finds duplicate files to those included in the specified
    @ECHO: robocopy log and lists them in a text file on the desktop
    @ECHO:
    @ECHO: ===========================================================
    @ECHO:
    
    
    :: Define Duplicates log
        SET "DUPE_CHECK_LOG=%userprofile%\desktop\duplicates_log.txt"
    
    :: Define Directory to scan
        SET "IN_DIR=U:\Backups\Music_Library"
    
    :: Define temporary DIR txt file
    SET "TMP_DIR_TXT=%TMP%\Temp_Dir_list.txt"
    
    :: FOR THE ROBOCOPY LOG COPIED FILE PATH CONVERTER
    :: Define Temporary robocopy log converted path file
        SET "TMP_RC=%TMP%\Temp_RC_Conv.txt"
    
    :: Define common root folder between both locations
    :: copied from (in the Rc log) and backup location.
        SET "CommonSplit=\Digital_Music_Library\"
    :: END SECTION
    
    :: Put rc log file path into a variable for easier reading
        SET /P "RC_LOG=Drag and drop desired Robocopy Log here > "
    
    
    :: strip rc log down to name.ext only
        FOR /F "tokens=*" %%a IN ("%RC_LOG%") DO SET "RC_LOG_name=%%~nxa"
    
    :: Print rc log name to new entry in dupes log
        (
        @ECHO:
        @ECHO:  %RC_LOG_name%
        )>>"%DUPE_CHECK_LOG%"
    
    :: Debug
    ::PAUSE
    
    :: Delete old converted robocopy log if present
        IF EXIST "%TMP_RC%" DEL /Q "%TMP_RC%"
    
    :: Delete old dir txt if present
        IF EXIST "%TMP_DIR_TXT%" DEL /Q "%TMP_DIR_TXT%"
    
    :: Convert Robocopy Log to a list of those files in their new location
    :: on the target drive.  RC log lists 'from' location not 'to', so all
    :: files wouldn't share the same path - different drive altogether.
    :: Need to strip path of copied files down to a common root folder 
    :: between the two locations.  eg. Digital_Music_Library
    
    :: The delimiter after delims= must be a single literal TAB (Robocopy logs are tab-separated).
    FOR /F "tokens=3 delims=	" %%a IN ('FINDSTR /C:"New File" "%RC_LOG%"') DO (
    CALL :RC_CONVERTER "%%~fa"
    )
    
    :: Debug
    ::PAUSE
    
    :: Take all of the data from text file and jam it into an array of variables
    :: Each number of the var corresponds to that line in the text file...sheesh!
    :: One var name for full path, the other for filenameext.
    :: I just don't want yet another FOR loop to play with parameters/arguments - eg. %%~dpnxa
    :: Shove this bit after the RC log is created, but really before anything else.
    :: EnableDelayedExpansion must persist, can't ENDLOCAL on it yet, as it'll clear the var array!
    
    SETLOCAL EnableDelayedExpansion
    SET /A i=0
    FOR /F "usebackq tokens=*" %%a IN ("%TMP_RC%") DO (
        SET /A i+=1
        SET "ArrayFullPath[!i!]=%%~fa"
        SET "ArrayFilenameExt[!i!]=%%~nxa"
    )
    
    :: debug
    ::PAUSE
    
    
    
    :: Dumps contents of music library into a txt file, all mp3s.
    DIR /b/s "%IN_DIR%\*.mp3">>"%TMP_DIR_TXT%"
    
    :: Loop through each line in Dir.txt checking if file is dupe
    FOR /F "usebackq tokens=*" %%a IN ("%TMP_DIR_TXT%") DO (
    CALL :CHECKER "%%~fa"
    REM PAUSE
    )
    
    
    :: End out commands
    
    :: Delete old converted robocopy log if present
        IF EXIST "%TMP_RC%" DEL /Q "%TMP_RC%"
    
    :: Delete old dir txt if present
        IF EXIST "%TMP_DIR_TXT%" DEL /Q "%TMP_DIR_TXT%"
    
    @ECHO:
    @ECHO: Finished finding duplicates.
    ENDLOCAL
    PAUSE
    GOTO :EOF
    
    
    :CHECKER
    :: Start at 1 and go to the maximum number of array vars created earlier.
    :: For each one compare the found filename.ext from above against the filename.ext in the array
    :: If it matches, make sure the path doesn't, and then print full path to duplicate file to a text file.
    :: End subroutine and go back for the next found file.
    FOR /L %%a IN (1,1,%i%) DO (
        IF /I "!ArrayFilenameExt[%%a]!" == "%~nx1" (
            IF /I NOT "!ArrayFullPath[%%a]!" == "%~f1" (
                FOR /F "tokens=*" %%z IN ("%~f1") DO (
                    ECHO %%~fz>>"%DUPE_CHECK_LOG%"
                    )
                )
            )
    )
    EXIT /B
    
    :RC_CONVERTER
    :: Take the full path from the RC log sent as parameter/argument
    :: Use variable substring substitution to remove everything before and 
    :: including the common folder.
    :: Print out to text file target drive, path, common folder and remaining path of 
    :: copied files.
        SET "TMP_STR=%~f1"
        CALL SET "TempStr=%%TMP_STR:*%CommonSplit%=%%"
        FOR /F "tokens=*" %%g in ("%TempStr%") DO (
        @ECHO %IN_DIR%%CommonSplit%%%~g>>"%TMP_RC%"
        )
    EXIT /B
    

0 answers:

No answers