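Compare file names and display files with most word matches?

Time: 2017-10-22 05:36:54

Tags: batch-file

I need a way to count the number of matching words in file names, to determine whether the files are likely to be about the same subject. I know how to split a file name into variables...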

set count=0
for %%i in (%filename%) do set "word%count%=%%i" && set /a "count+=1"

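...but I need a way to compare it against a large number of files and display the best matches, which is a bit beyond my skill level. I need at least a push in the right direction to get me started.

Here is an example of what I mean; this one has only 5 files to compare: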

From Dusk Till Dawn (1996) Robert Rodriguez [Horror, Action, Thriller, Crime] r6.9 1080p x265 AAC tt0116367.mkv
Full Metal Jacket (1987) Stanley Kubrick [Drama, War] r7.8 1080p x265 AAC tt0093058.mkv
Full Metal Jacket LOCKED AND LOADED Fanedit (1987) Stanley Kubrick [Drama, War] r7.8 720p x264 AC3 tt0093058.mkv
Desperado (1995) Robert Rodriguez [Thriller, Action, Crime] r6.8 1080p x265 AAC tt0112851.mkv
King of New York (1990) Abel Ferrara [Thriller, Crime] r6.5 1080p x265 AAC tt0099939.mp4

It should be able to handle an entire directory tree of files. The results should be listed in order of the number of matches:

10 words match
Full Metal Jacket (1987) Stanley Kubrick [Drama, War] r7.8 1080p x265 AAC tt0093058.mkv
Full Metal Jacket LOCKED AND LOADED Fanedit (1987) Stanley Kubrick [Drama, War] r7.8 720p x264 AC3 tt0093058.mkv

8 words match
From Dusk Till Dawn (1996) Robert Rodriguez [Horror, Action, Thriller, Crime] r6.9 1080p x265 AAC tt0116367.mkv
Desperado (1995) Robert Rodriguez [Thriller, Action, Crime] r6.8 1080p x265 AAC tt0112851.mkv

5 words match
From Dusk Till Dawn (1996) Robert Rodriguez [Horror, Action, Thriller, Crime] r6.9 1080p x265 AAC tt0116367.mkv
King of New York (1990) Abel Ferrara [Thriller, Crime] r6.5 1080p x265 AAC tt0099939.mp4

..and so on

I would like a set number of required matches; for example, don't display files with fewer than 6 matching words.

2 Answers:

Answer 0: (score: 1)

@ECHO OFF
SETLOCAL
set /a count=0
for %%i in (*) do set /a count+=1&CALL set "word%%count%%=%%i" 
SET wo
GOTO :EOF

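I changed the file mask to * to suit my system.

set /a does not need quotes and ignores spaces. For a string assignment, using quotes (set "var=value") ensures that trailing spaces are not included in the string value.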

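Moving the set /a count+=1 before the file name assignment ensures that the numbering starts at 1 and ends at count.

Using call on the set makes the set command be parsed a second time. On the first pass, %%i is replaced by its value because it is a metavariable, and each remaining %% is reduced to a single % (% escapes %). The command that call then executes is set "word%count%=<value of %%i>", with %count% resolved to its current value.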

set wo displays all variables whose names start with wo.
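For comparison, the same numbering can be done with delayed expansion instead of call. This is only a minimal sketch applied to the words of a single name, as in the question; the sample value of filename is hypothetical, and a name containing ! would be mangled while delayed expansion is enabled.

```bat
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical sample value; any space-separated words would do.
set "filename=Full Metal Jacket 1987"

set /a count=0
for %%i in (%filename%) do (
    rem Increment first so numbering starts at 1.
    set /a count+=1
    rem !count! is read when the line runs, not when the block is parsed.
    set "word!count!=%%i"
)

rem List every variable whose name starts with "word".
set word
```

With the sample value above, this should list word1 through word4. Using %count% inside the loop body instead would store every word under the same variable name, because %count% is expanded once when the for statement is parsed.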

Answer 1: (score: 1)

You should note that Stack Overflow is not a free code-writing service. However, this question was interesting to me, so I made an exception...

@echo off
setlocal EnableDelayedExpansion

set "minMatch=5"

rem Process all files in current directory
set /A "i=0, maxMatch=0"
for /F "delims=" %%i in ('dir /A-D /B') do (
   set /A i+=1
   set "file[!i!]=%%~NXi"

   rem Compare this file vs. all files below it
   set "j=0"
   for /F "delims=" %%j in ('dir /A-D /B') do (
      set /A j+=1
      if !j! gtr !i! (

         rem Compare words, count match and store this pair of names
         set "n=0"
         for %%a in (%%~NXi) do for %%b in (%%~NXj) do (
            if /I "%%a" equ "%%b" set /A n+=1
         )
         if !n! geq %minMatch% (
            for %%n in (!n!) do set "match[%%n]=!match[%%n]! !i!+!j!"
            if !n! gtr !maxMatch! set /A maxMatch=n
         )

      )
   )

)

rem Show results
for /L %%n in (%maxMatch%,-1,%minMatch%) do if defined match[%%n] (
   echo %%n words match
   for %%m in (!match[%%n]!) do for /F "tokens=1,2 delims=+" %%i in ("%%m") do (
      echo !file[%%i]!
      echo !file[%%j]!
      echo/
   )
   echo/
)

Output example:

10 words match
Full Metal Jacket (1987) Stanley Kubrick [Drama, War] r7.8 1080p x265 AAC tt0093058.mkv
Full Metal Jacket LOCKED AND LOADED Fanedit (1987) Stanley Kubrick [Drama, War] r7.8 720p x264 AC3 tt0093058.mkv


7 words match
From Dusk Till Dawn (1996) Robert Rodriguez [Horror, Action, Thriller, Crime] r6.9 1080p x265 AAC tt0116367.mkv
Desperado (1995) Robert Rodriguez [Thriller, Action, Crime] r6.8 1080p x265 AAC tt0112851.mkv


5 words match
Desperado (1995) Robert Rodriguez [Thriller, Action, Crime] r6.8 1080p x265 AAC tt0112851.mkv
King of New York (1990) Abel Ferrara [Thriller, Crime] r6.5 1080p x265 AAC tt0099939.mp4
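The per-pair word count at the heart of the script can be tried in isolation. The sketch below uses two hypothetical, shortened names (nameA and nameB are not part of the script above) and the same nested for comparison. Because for splits on spaces and commas only, a bracket stays attached to its word, so "[Thriller" and "Thriller" are counted as different words. That is probably why the output above reports 7 matches for the Rodriguez pair where the question expected 8.

```bat
@echo off
setlocal EnableDelayedExpansion

rem Two hypothetical names, shortened from the sample list.
set "nameA=Desperado Robert Rodriguez [Thriller, Action, Crime]"
set "nameB=From Dusk Till Dawn Robert Rodriguez [Horror, Action, Thriller, Crime]"

rem Compare every word of nameA with every word of nameB, ignoring case.
set "n=0"
for %%a in (!nameA!) do for %%b in (!nameB!) do (
    if /I "%%a" equ "%%b" set /A n+=1
)

echo !n! words match
```

For these two names the count should come out as 4 (Robert, Rodriguez, Action and Crime]). Stripping the brackets from the names before comparing would be one way to make "Thriller" match in both positions.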

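You must realize that the number of operations in this program grows rapidly with both the number of files and the number of words in each file name, because every file is compared with every other file, word by word. If the number of files to process is large, this program may take a long time to complete...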
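As a rough estimate, not a measurement: assume n files and about w words per name (both symbols are introduced here for illustration). The script compares each pair of files once, and each pair costs about w times w word comparisons:

```latex
\[
P = \binom{n}{2} = \frac{n(n-1)}{2}, \qquad C \approx P \cdot w^{2}
\]
\[
n = 5:\quad P = 10
\]
\[
n = 1000,\ w = 13:\quad P = 499\,500, \qquad C \approx 499\,500 \times 169 = 84\,415\,500
\]
```

Here P is the number of file pairs and C the approximate number of word comparisons; w = 13 is only an assumed average. On top of that, the inner dir listing is run again for every file, so the loops themselves go through about n squared iterations before any words are compared.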