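I have file names in several different formats. How can I get 123.pdf, 124.pdf and 125.pdf out of them?
The length of the file name can vary. The 14-5678 part has nothing to do with this and should be ignored.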
14-5678_jobname_0123_.p1.PDF
14-5678_jobname_0123_.p2.PDF
14-5678_jobname_0125_.p1.PDF
Weired_filename_0123_bla_14-5678_jobname.p1.PDF
Weired_filename_0123_bla_14-5678_jobname.p2.PDF
Weired_filename_0125_bla_14-5678_jobname.p1.PDF
14-5678_jobname_0123.p1.PDF
14-5678_jobname_0123.p2.PDF
14-5678_jobname_0125.p1.PDF
0123_14-5678_jobname.p1.PDF
0123_14-5678_jobname.p2.PDF
0125_14-5678_jobname.p1.PDF
jobname_0123_14-5678.p1.PDF
jobname_0123_14-5678.p2.PDF
jobname_0125_14-5678.p1.PDF
I have spent hours with regex testers and am now completely stuck. I would love some PHP code that does the job.
Answer 0 (score: 0)
You need to match a run of four digits that is not preceded by a dash:
/[^-](\d{4})/
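Breaking down the regex:
[^-] : any single character that is not a dash
\d{4} : four digits
(\d{4}) : capture those digits
You can then append .pdf to get the file name.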
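One caveat with the pattern above: `[^-]` has to consume a character, so it cannot match a number at the very start of a name such as 0123_14-5678_jobname.p1.PDF. A minimal sketch of one way around that, assuming a lookaround-based pattern is acceptable. Both the pattern and the helper name `extractNumber` are illustrative and not part of the original answer:

```php
<?php
// Hypothetical helper: returns the four-digit number as a string,
// or null when the name contains no suitable run of digits.
function extractNumber(string $name): ?string
{
    // (?<![-\d]) : not preceded by a dash or another digit
    // (\d{4})    : capture exactly four digits
    // (?!\d)     : not followed by a further digit
    if (preg_match('/(?<![-\d])(\d{4})(?!\d)/', $name, $m)) {
        return $m[1];
    }
    return null;
}

echo extractNumber('0123_14-5678_jobname.p1.PDF') . '.pdf' . PHP_EOL;  // prints 0123.pdf
echo extractNumber('14-5678_jobname_0125_.p1.PDF') . '.pdf' . PHP_EOL; // prints 0125.pdf
```

Because a lookbehind checks the preceding character without consuming it, the number is found whether it sits at the start, in the middle, or just before the extension.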
A preg_replace example, using the file names you gave, stored in an array $files:
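foreach ($files as $f) {
    # the greedy [^-]* backtracks to the last four-digit run in the first
    # dash-free stretch that contains one, which skips the 14-5678 part
    echo "$f => " . preg_replace("/.*?[^-]*(\d{4}).+/", "$1.pdf", $f) . PHP_EOL;
}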
ETA: if you want to take the page number into account, you can use the following code:
foreach ($files as $f) {
    # this saves the four digits of the PDF name, and the number in p1/p2
    if (!preg_match("/.*?[^-]*(\d{4}).*?p(\d+)\.pdf/i", $f, $matches)) {
        continue; # skip names that do not fit the pattern
    }
    # if the number (from p1/p2) is greater than 1, add the page offset
    # (page number minus one) to the PDF name number
    if ($matches[2] > 1) {
        $matches[1] += $matches[2] - 1;
    }
    # format the pdf name to be four digits long, with zero padding for shorter names
    echo "$f => " . sprintf('%04d.pdf', $matches[1]) . PHP_EOL;
}
Output:
14-5678_jobname_0123_.p1.PDF => 0123.pdf
14-5678_jobname_0123_.p2.PDF => 0124.pdf
14-5678_jobname_0125_.p1.PDF => 0125.pdf
Weired_filename_0123_bla_14-5678_jobname.p1.PDF => 0123.pdf
Weired_filename_0123_bla_14-5678_jobname.p2.PDF => 0124.pdf
Weired_filename_0125_bla_14-5678_jobname.p1.PDF => 0125.pdf
14-5678_jobname_0123.p1.PDF => 0123.pdf
14-5678_jobname_0123.p2.PDF => 0124.pdf
14-5678_jobname_0125.p1.PDF => 0125.pdf
0123_14-5678_jobname.p1.PDF => 0123.pdf
0123_14-5678_jobname.p2.PDF => 0124.pdf
0125_14-5678_jobname.p1.PDF => 0125.pdf
jobname_0123_14-5678.p1.PDF => 0123.pdf
jobname_0123_14-5678.p2.PDF => 0124.pdf
jobname_0125_14-5678.p1.PDF => 0125.pdf
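If the result is needed as a value rather than echoed, the same idea can be wrapped in a function. The following is only a sketch under a few assumptions: the page suffix always has the form .pN.pdf at the end of the name, the four-digit number is found with a lookbehind-based pattern rather than the answer's regex, and the function name `targetPdfName` is hypothetical.

```php
<?php
// Hypothetical helper: maps a source file name to its target PDF name,
// or returns null when the name does not fit the assumed layout.
function targetPdfName(string $name): ?string
{
    // Group 1: four digits not adjacent to a dash or another digit.
    // Group 2: the page number taken from the trailing ".pN.pdf".
    if (!preg_match('/(?<![-\d])(\d{4})(?!\d).*\.p(\d+)\.pdf$/i', $name, $m)) {
        return null;
    }
    // Page 1 keeps the base number; page N adds N - 1.
    $offset = max(0, (int) $m[2] - 1);
    return sprintf('%04d.pdf', (int) $m[1] + $offset);
}

$files = [
    '14-5678_jobname_0123_.p2.PDF',
    'Weired_filename_0125_bla_14-5678_jobname.p1.PDF',
];
print_r(array_map('targetPdfName', $files));
```

Returning null for names that do not match lets the caller decide whether to skip or report them, instead of reading from an empty match array.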