I have a list of filenames and a list of titles that I want to match to each other (for a TV show tracking application I'm writing).
Example:
[Commie] Psycho-Pass 2 - 01 [495A3950].mkv //filename
Psycho-Pass 2 // title it should be matched to
[UTW]_Fate_Kaleid_Liner_Prisma_Ilya_2wei_-_01_[h264-720p][34F564F6].mkv
Fate Kaleid Liner Prisma Ilya 2wei
The.Big.Bang.Theory.S08E05.720p.HDTV.X264-DIMENSION[rartv]
The Big Bang Theory
Modern.Family.S06E03.720p.HDTV.x264-KILLERS[rartv]
Modern Family
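I think a regexp would be a somewhat cumbersome solution, because the filename format is not always the same. I'm considering a comparison system that decides based on a confidence measure (a percentage threshold). The actual titles are predefined in a database (without episode numbers). I basically need to match each filename to a title.
I'd rather not go down the machine learning road if I don't have to ;)
Any ideas?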
Answer 0 (score: 0)
Wouldn't the following simple approach work?
for each $title
    $count = 0
    for each $word in $title
        if $word in $filename
            $count++
    /* additive error */
    if $count >= (number of words in $title) - $some_alpha
        /* found matching title */
    /* multiplicative error */
    if $count / (number of words in $title) >= $some_percentage
        /* found matching title */
Or are you looking for something more sophisticated?
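The counting idea above could be sketched in PHP roughly as follows. This is only an illustration under a few assumptions: the function names `wordOverlap` and `matchByWords` and the 0.8 default threshold are made up for this example, and words are taken to be runs of letters and digits, compared case-insensitively. It implements the multiplicative (percentage) variant.

```php
<?php
// Fraction of the title's words that also occur in the filename (0.0 to 1.0).
// Hypothetical helper: a "word" is assumed to be a run of letters or digits.
function wordOverlap(string $title, string $filename): float
{
    $titleWords = preg_split('/[^a-z0-9]+/', strtolower($title), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $fileWords  = preg_split('/[^a-z0-9]+/', strtolower($filename), -1, PREG_SPLIT_NO_EMPTY) ?: [];
    if (count($titleWords) === 0) {
        return 0.0;
    }
    $count = 0;
    foreach ($titleWords as $word) {
        if (in_array($word, $fileWords, true)) {
            $count++;
        }
    }
    return $count / count($titleWords);
}

// Returns the title with the highest overlap, or null if none reaches
// the threshold. The 0.8 default is an arbitrary illustrative value.
function matchByWords(string $filename, array $titles, float $threshold = 0.8): ?string
{
    $best = null;
    $bestScore = 0.0;
    foreach ($titles as $title) {
        $score = wordOverlap($title, $filename);
        if ($score >= $threshold && $score > $bestScore) {
            $best = $title;
            $bestScore = $score;
        }
    }
    return $best;
}
```

One caveat with this kind of scoring: very short titles made of common words can reach a full score on unrelated filenames, so the threshold probably needs tuning against real data.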
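Answer 1 (score: 0)
After some research, I stumbled upon PHP's levenshtein function:
http://php.net/manual/en/function.levenshtein.php
Since I have already populated the database with the show names and only want to match filenames against them, I can use that function to iterate over every show name and pick the best fit!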
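A minimal sketch of that idea, with one deliberate change: comparing a title against the whole filename would let the extra characters (release group, episode number, hash) dominate the distance, so this version compares the title only against runs of consecutive filename words of the same length. The helper names `tokenize`, `windowDistance` and `bestTitle` are hypothetical, and the word-splitting rule (dots, underscores and whitespace as separators) is an assumption based on the example filenames.

```php
<?php
// Hypothetical helper: split text into lowercase words on dots,
// underscores and whitespace.
function tokenize(string $text): array
{
    return preg_split('/[._\s]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];
}

// Smallest Levenshtein distance between the title and any run of
// consecutive filename words that has as many words as the title.
function windowDistance(string $title, string $filename): int
{
    $titleWords = tokenize($title);
    $fileWords  = tokenize($filename);
    $n = count($titleWords);
    if ($n === 0) {
        return PHP_INT_MAX; // an empty title never matches
    }
    $needle = implode(' ', $titleWords);
    $best = strlen($needle); // cap: distance from the title to an empty string
    for ($i = 0; $i + $n <= count($fileWords); $i++) {
        $window = implode(' ', array_slice($fileWords, $i, $n));
        $best = min($best, levenshtein($needle, $window));
    }
    return $best;
}

// Iterate over every show name and keep the one with the lowest distance.
function bestTitle(string $filename, array $titles): ?string
{
    $bestTitle = null;
    $bestScore = PHP_INT_MAX;
    foreach ($titles as $title) {
        $score = windowDistance($title, $filename);
        if ($score < $bestScore) {
            $bestScore = $score;
            $bestTitle = $title;
        }
    }
    return $bestTitle;
}
```

Because the distance tolerates small differences, a filename with a typo such as `Modern.Famly.S06E03` would still land closest to "Modern Family". In practice you would probably also reject results whose distance is too large relative to the title length, so that unknown shows are not forced onto some title.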
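Answer 2 (score: 0)
From your description, you have a database that already holds a list of titles, and you now want to match them against filenames. The code below does that. It echoes "Match" and "Match Not found" as placeholders for whatever you want to do in each case.
The first thing to do is clean up the filename; after that you match each title against the cleaned filename. For this example I'll assume you are trying to match the titles in your list against the filename [Commie] Psycho-Pass 2 - 01 [495A3950].mkv. The code is shown below. You can copy and paste it, and it will work.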
/** List of titles from the database **/
$title_array = ["Psycho-Pass 2", "Fate Kaleid Liner Prisma Ilya 2wei", "The Big Bang Theory", "Modern Family"];
/** Filename you want to match against the titles **/
$filename_raw = "[Commie] Psycho-Pass 2 - 01 [495A3950].mkv";
/**
 * Clean the $filename.
 * Replace dots and underscores with spaces. The dot must be escaped because
 * it is a special character in a regular expression.
 * These two variables hold the patterns to replace and their replacements.
 **/
$patterns = array('/\./', '/_/');
$replace = array(' ', ' ');
/**
 * This is where the replacement occurs.
 **/
$filename_clean = preg_replace($patterns, $replace, $filename_raw);
foreach ($title_array as $title) {
    if (strpos($filename_clean, $title) !== false) {
        echo "Match <br />";
        /**
         * You might want to put a break here since you have already found
         * the match, but I will leave that up to you.
         */
    } else {
        echo "Match Not found<br />";
    }
}
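If you need the matched title itself rather than an echo per title, the loop above could be wrapped in a function along these lines. This is a sketch, not part of the original code: `findTitle` is a hypothetical name, `stripos()` is used so that case differences do not matter, and when several titles match (for example both "Psycho-Pass" and "Psycho-Pass 2" are in the database) it assumes the longest one is the intended match.

```php
<?php
// Returns the longest title contained in the cleaned filename, or null.
function findTitle(string $filenameRaw, array $titles): ?string
{
    // Same cleaning step as above: dots and underscores become spaces.
    $filenameClean = str_replace(['.', '_'], ' ', $filenameRaw);
    $best = null;
    foreach ($titles as $title) {
        // stripos() is the case-insensitive counterpart of strpos().
        if (stripos($filenameClean, $title) !== false
            && ($best === null || strlen($title) > strlen($best))) {
            $best = $title;
        }
    }
    return $best;
}
```

Note that a title which itself contains a dot or underscore would no longer be found after the cleaning step, so such titles would need the same cleaning applied before the comparison.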