我已经广泛搜索了VBscript的答案,但已经放弃并需要帮助。
我想要完成的是找到具有不同文件名的明显重复的文件(对于人类来说显而易见)。我需要删除重复项,保留名称中没有轨道号的那些副本。如果我已经在MP3中使用它,我还需要删除任何M4A版本。
这甚至可能吗?我做了一点VBscripting,但这是我的有限编程能力。我甚至不打算在这里复制我尝试过的代码,因为它都没有用。
这是我正在尝试清理的示例文件夹。我只想留下这里的两首独特歌曲。我只想要MP3版本,我不想要他们名字中的曲目编号。
07坠入爱河(难以理解K.mp3) 1-15爱在电梯里.m4a
1-15爱在Elevator.mp3中 15爱在Elevator.mp3中 2-07坠入爱河(在The.m4a上很难 2-07坠入爱河(很难在The.mp3上 坠入爱河(在膝盖上很难).mp3
爱在Elevator.mp3中
谢谢!
答案 0 :(得分:0)
这不是一项简单的任务,因为基本上你想要测量不同文件名的相似性/接近度。我的外行方法是从文件名中提取标题,对其进行标准化,然后使用最短的左对齐进行比较。这样的事情可能有用:
Set fso = CreateObject("Scripting.FileSystemObject")
Set re = New RegExp
re.Pattern = "^\d+(-\d+)?\s+"
Set rs = CreateObject("ADOR.Recordset")
rs.Fields.Append "NormalizedName", 200, 255
rs.Fields.Append "Length", 3
rs.Fields.Append "Path", 200, 255
rs.Open
' Store the full paths of the files and their associated normalized name in
' a disconnected recordset. The "Length" field is used for sorting (see below).
For Each f In fso.GetFolder("C:\some\folder").Files
normalizedName = LCase(re.Replace(fso.GetBaseName(f.Name), ""))
rs.AddNew
rs("NormalizedName").Value = normalizedName
rs("Length").Value = Len(normalizedName)
rs("Path").Value = f.Path
rs.Update
Next
' sort to ensure that the shortest normalized name always comes first
rs.Sort = "NormalizedName, Length ASC"
ref = ""
Set keeplist = CreateObject("Scripting.Dictionary")
rs.MoveFirst
Do Until rs.EOF
path = rs("Path").Value
name = rs("NormalizedName").Value
currentExtension = LCase(fso.GetExtensionName(path))
If ref <> "" And ref = Left(name, Len(ref)) Then
' same title as last file, so check if this one is a better match
If extension <> "mp3" And currentExtension = "mp3" Then
' always pick MP3 version if it exists
keeplist(ref) = path
extension = currentExtension
ElseIf extension = currentExtension _
And IsNumeric(Left(fso.GetBaseName(keeplist(ref)), 1)) _
And Not IsNumeric(Left(fso.GetBaseName(path), 1)) Then
' prefer file names not starting with a number when they have the
' same extension
keeplist(ref) = path
End If
Else
' first file or different reference name
ref = name
extension = currentExtension
keeplist.Add ref, path
End If
rs.MoveNext
Loop
rs.Close
For Each ref In keeplist
WScript.Echo keeplist(ref)
Next
我很确定上面的代码没有涵盖边缘情况,所以要小心处理。另请注意,代码只处理单个文件夹。要处理文件夹树,需要其他代码(例如,请参阅here)。