I am trying to capture a 3-letter word from a text file. I wrote a regex, but it returns an EMPTY array, and I can't figure out why. Here is part of the text file:
================================================
Header of File with time and date
================================================
Loaded options from XML file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/ID2PDF_options.xml
extendedPrintPDF started
Postfix '3.0' was append from file 'ESQ030112ELAM_lo-metadata.xml' for file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/someFile.indd
printPDF started
PDF Export Preset: Some preset
PDF file created: ''/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/someFile.pdf'.
File someFileName.xml removed
postprocessingDocument started
INDD file removed: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/someFile.indd
Here is my regex:
/^Loaded options from XML file: '\/.*\/SM_Folder\/([a-zA-Z]{3})\/[a-zA-Z]+_Proof_\1\/processing\/ID2PDF_options.xml$/im
If I remove the \ in front of /([a-zA-Z]{3}), I get an Unknown modifier '(' error.
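Can someone tell me what I need to do to get "ESQ" from the first line of the record? The 3-letter word differs between records, so I can't hard-code the regex to capture only ESQ. It could be ABC or XYZ, for example, but it is always a 3-letter word. Any helpful input would be appreciated.
Also, this post did not help much: PHP Regex returning array with values empty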
Note: options.xml does not end with a ' in the log, and that is intentional.
Answer 0 (score: 1)
[a-zA-Z]_Proof
should be
[a-zA-Z]+_Proof
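To illustrate why the quantifier matters, here is a minimal sketch. The shortened path and the variable names are made up for this example and are not from the question. Without the +, [a-zA-Z] matches exactly one letter between the slash and _Proof, so a folder name such as Virtual_Proof_ESQ cannot match.

```php
<?php
// Hypothetical, shortened version of the path from the log line in the question.
$path = '/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/ID2PDF_options.xml';

// Exactly one letter between "/" and "_Proof": "Virtual" has seven letters, so this fails.
$withoutPlus = preg_match('/\/([a-zA-Z]{3})\/[a-zA-Z]_Proof_\1\//', $path, $m1);

// One or more letters: this matches and captures the three-letter code in group 1.
$withPlus = preg_match('/\/([a-zA-Z]{3})\/[a-zA-Z]+_Proof_\1\//', $path, $m2);

var_dump($withoutPlus, $withPlus, $m2[1] ?? null);
```

The first call should report no match and the second should capture ESQ.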
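Answer 1 (score: 1)
The regex pattern and the file data you posted in the question do not produce an empty array, at least not for me (read on for why it may for you). With preg_match_all I correctly get one match. I used this code: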
$file = <<<FILE
================================================
Header of File with time and date
================================================
Loaded options from XML file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/ID2PDF_options.xml
extendedPrintPDF started
Postfix '3.0' was append from file 'ESQ030112ELAM_lo-metadata.xml' for file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/someFile.indd
printPDF started
PDF Export Preset: Some preset
PDF file created: ''/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/someFile.pdf'.
File someFileName.xml removed
postprocessingDocument started
INDD file removed: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/someFile.indd
FILE;
$pattern = '/^Loaded options from XML file: \'\/.*\/SM_Folder\/([a-zA-Z]{3})\/[a-zA-Z]+_Proof_\1\/processing\/ID2PDF_options.xml$/im';
$result = preg_match_all($pattern, $file, $matches);
var_dump($result, $matches);
Result:
int(1)
array(2) {
[0] =>
array(1) {
[0] =>
string(127) "Loaded options from XML file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/ID2PDF_options.xml"
}
[1] =>
array(1) {
[0] =>
string(3) "ESQ"
}
}
You are probably getting something like the following instead (exactly the same code as above, but run on a different machine, see the demo here):
int(0)
array(2) {
[0]=>
array(0) {
}
[1]=>
array(0) {
}
}
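If you get this result, the ^ and $ anchors in multiline mode are not matching at the line boundaries. Your lines most likely do not end in a bare \n but in a CRLF sequence (DOS/Windows line endings), and with the default newline setting $ only matches immediately before \n, so the \r gets in the way. You can use the ANYCRLF option to accept all of these sequences (CR, LF, and CRLF):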
$pattern = '/(*ANYCRLF)^Loaded options from XML file: \'\/.*\/SM_Folder\/([a-zA-Z]{3})\/[a-zA-Z]+_Proof_\1\/processing\/ID2PDF_options.xml$/im';
             ^^^^^^^^^^
That should then give you the expected result. See the working demo.
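The effect can be reproduced without a file by building a string with explicit CRLF line endings. This is only a sketch: the shortened path, the neighbouring log lines, and the variable names are assumptions for illustration, and it assumes PHP's PCRE uses LF as its default newline convention.

```php
<?php
// Three log lines joined with CRLF, as a file written on Windows might contain.
// The path is a shortened, hypothetical version of the one in the question.
$file = "printPDF started\r\n"
      . "Loaded options from XML file: '/x/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/ID2PDF_options.xml\r\n"
      . "extendedPrintPDF started\r\n";

// Pattern body shared by both attempts (delimiters and modifiers are added below).
$body = '^Loaded options from XML file: \'\/.*\/SM_Folder\/([a-zA-Z]{3})\/[a-zA-Z]+_Proof_\1\/processing\/ID2PDF_options\.xml$';

// Default newline handling: "$" wants to sit right before "\n", but "\r" comes first.
$plain = preg_match_all('/' . $body . '/im', $file, $m1);

// (*ANYCRLF) makes CR, LF and CRLF all count as line endings for "^" and "$".
$anyCrlf = preg_match_all('/(*ANYCRLF)' . $body . '/im', $file, $m2);

var_dump($plain, $anyCrlf, $m2[1]);
```

Under that assumption the first call should find no match and the second should find one, with ESQ in the first capture group. Normalising the line endings before matching would be another way to get the same effect.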
Answer 2 (score: 0)
\/([a-zA-Z]{3})
is not a valid regular expression on its own in PHP. You are missing the delimiters.
preg_match_all(":\/([a-zA-Z]{3}):", $input, $matches);
You can use any non-alphanumeric, non-backslash, non-whitespace character as the delimiter; I chose :
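Choosing a delimiter that does not occur in the pattern also removes the need to escape every forward slash. As a sketch, here is the full pattern from the question rewritten with # as the delimiter. The choice of # and the variable names are illustrative, and the dot in options.xml is escaped here, which the original pattern did not do.

```php
<?php
// Sample line copied from the log in the question.
$line = "Loaded options from XML file: '/Thisis/some/Users/sumuser/Desktop/SM_Folder/ESQ/Virtual_Proof_ESQ/processing/ID2PDF_options.xml";

// With "#" as the delimiter, the slashes in the path are ordinary characters.
$pattern = '#^Loaded options from XML file: \'/.*/SM_Folder/([a-zA-Z]{3})/[a-zA-Z]+_Proof_\1/processing/ID2PDF_options\.xml$#im';

if (preg_match($pattern, $line, $matches) === 1) {
    // Group 1 holds the three-letter code.
    echo $matches[1], PHP_EOL;
}
```

This also avoids the Unknown modifier error, which appears when an unescaped / inside the pattern is read as the closing delimiter and the characters after it are treated as modifiers.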