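I have a set of files with a specific structure:
COMPANY_DE-Actual-Contents-of-File-RGB-ENG.pdf
Breakdown:
In the best case, my result would be an array holding the above information under named keys, but I don't know where to start.
Any help is much appreciated!
Thanks, Knal
Sorry for not being very clear: some parts of the file name are variable:
- DE -> fixed options: '_DE', '_BE' or absent
- RGB -> color mode, fixed options: 'RGB', 'CMYK', 'PMS' or absent
- ENG -> language of the file, fixed options: 'GER', 'ENG' or absent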
Answer 0 (score: 1)
Try:
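$string = "COMPANY_DE-Actual-Contents-of-File-RGB-ENG.pdf";
// Split on '-', '_' and '.'
$array = preg_split('/[-_\.]/', $string);
$len = count($array);
// Fixed positions: company, location, (content placeholder), color mode, language, extension.
// Note: this only works when none of the optional parts is absent.
$struct = array($array[0], $array[1], '', $array[$len-3], $array[$len-2], $array[$len-1]);
unset($array[0], $array[1], $array[$len-3], $array[$len-2], $array[$len-1]);
// Whatever is left over is the content
$struct[2] = implode('-', $array);
var_dump($struct);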
Output:
array
0 => string 'COMPANY' (length=7)
1 => string 'DE' (length=2)
2 => string 'Actual-Contents-of-File' (length=23)
3 => string 'RGB' (length=3)
4 => string 'ENG' (length=3)
5 => string 'pdf' (length=3)
Answer 1 (score: 1)
If possible, try not to use regular expressions, or keep them as simple as possible.
$text = "COMPANY_DE-Actual-Contents-of-File-RGB-ENG.pdf";
$options_location = array('DE','BE');
$options_color = array('RGB','CMYK','PMS');
$options_language = array('ENG','GER');
// $text may hold several file names, one per line, so split it and handle each line in turn
$lines = explode("\n", $text);
foreach ($lines as $line) {
    $parts = preg_split('/[-_\.]/', $line);
    $data = array(); // result array
    $data['company'] = array_shift($parts); // The first element is always the company
    $data['filetype'] = array_pop($parts); // The last bit is always the file type
    foreach ($parts as $part) { // test each of the remaining ones for what it is
        if (in_array($part, $options_location))
            $data['location'] = $part;
        elseif (in_array($part, $options_color))
            $data['color'] = $part;
        elseif (in_array($part, $options_language))
            $data['lang'] = $part;
        else // Wasn't any of the others, so attach it to the content
            $data['content'] = isset($data['content']) ? $data['content'].' '.$part : $part;
    }
    print_r($data);
}
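This is also easier to follow, since you don't have to work out what exactly a regular expression is doing.
Note that this assumes no part of the content can be one of the words reserved for location, color or language. If those can occur within the content, you'll have to add conditions such as isset($data['location']) to check whether a location has already been found and, if so, append the part to the content instead of storing it as the location.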
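The isset() guard mentioned above could look roughly like this. It is a minimal sketch for the location only, where the real code always comes first because it directly follows the company. The function name classify_parts is hypothetical, and "first match wins" is an assumption; color mode and language sit at the end of the name, so they would need the opposite rule.

```php
<?php
// Hypothetical helper: classify the middle parts of a file name, keeping only
// the FIRST location code as the location (assumption: first match wins).
function classify_parts(array $parts): array
{
    $options_location = array('DE', 'BE');
    $data = array();
    foreach ($parts as $part) {
        if (in_array($part, $options_location, true) && !isset($data['location'])) {
            $data['location'] = $part; // first location code seen
        } else {
            // Anything else, including a repeated location code, is content
            $data['content'] = isset($data['content']) ? $data['content'].' '.$part : $part;
        }
    }
    return $data;
}

// The second location code ('BE') ends up in the content
print_r(classify_parts(array('DE', 'Guide', 'to', 'BE', 'Offices')));
```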
Answer 2 (score: 0)
Something like:
preg_match('#^([^_]+)(_[^-]+)?-([\w-]+)-(\w+)-(\w+)(\.\w+)$#i', 'COMPANY_DE-Actual-Contents-of-File-RGB-ENG.pdf', $m);
preg_match('#^([^_]+)(_[^-]+)?-([\w-]+)-(\w+)[_-]([^_]+)(\.\w+)$#i', 'COMPANY_DE-Actual-Contents-of-File-RGB-ENG.pdf', $m); // for both '_' and '-'
preg_match('#^(\p{Lu}+)(-\p{Lu}+)?-([\w]+)(\-(\p{Lu}+))?(\-(\p{Lu}+))?(\.\w+)$#', 'COMPANY-NL-Actual_Contents_of_File-RGB-ENG.pdf', $m); // if filename parts divider is strictly '-'
var_dump($m);
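In the last variant, if the country code (-NL) is missing, its group simply comes back empty (NULL if you pass PREG_UNMATCHED_AS_NULL). The color and language codes do not behave the same way, though. Try it yourself and you'll figure out how it works!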
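To see how the optional groups of the last variant behave when a part is missing, something like the following can be run. It is only a sketch: the two test file names and the variable names are made up for illustration, and the PREG_UNMATCHED_AS_NULL flag requires PHP 7.2 or later.

```php
<?php
// Same pattern as the last variant; groups 2, 5 and 7 are the optional codes.
$pattern = '#^(\p{Lu}+)(-\p{Lu}+)?-([\w]+)(\-(\p{Lu}+))?(\-(\p{Lu}+))?(\.\w+)$#';

// Country code missing: group 2 is unmatched, the two trailing codes still match
preg_match($pattern, 'COMPANY-Actual_Contents_of_File-RGB-ENG.pdf', $noCountry, PREG_UNMATCHED_AS_NULL);
var_dump($noCountry[2]); // NULL
var_dump($noCountry[5]); // string(3) "RGB"

// Color code missing: the groups are positional, so the language code
// lands in the first trailing group and the second one stays unmatched
preg_match($pattern, 'COMPANY-NL-Actual_Contents_of_File-ENG.pdf', $noColor, PREG_UNMATCHED_AS_NULL);
var_dump($noColor[5]); // string(3) "ENG"
var_dump($noColor[7]); // NULL
```

Because the two trailing groups accept any run of uppercase letters, the pattern by itself cannot tell a color code from a language code.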
Answer 3 (score: 0)
How about:
$files = array(
'COMPANY_DE-Actual-Contents-of-File-RGB-ENG.pdf',
'COMPANY_BE-Actual-Contents-of-File-CMYK-ENG.pdf',
'COMPANY_DE-Actual-Contents-of-File-PMS-GER.doc',
'COMPANY-Actual-Contents-of-File-PMS-GER.doc',
'COMPANY-Actual-Contents-of-File-GER.doc',
'COMPANY-Actual-Contents-of-File.doc',
);
foreach($files as $file) {
preg_match('/^(?<COMPANY>.*?)_?(?<LOCATION>DE|BE)?-(?<CONTENT>.*?)-?(?<COLOR>RGB|CMYK|PMS)?-?(?<LANG>ENG|GER)?\.(?<EXT>[^.]+)$/', $file, $m);
echo "\nfile=$file\n";
echo "COMPANY: ",$m['COMPANY'],"\n";
echo "LOCATION: ",$m['LOCATION'],"\n";
echo "CONTENT: ",$m['CONTENT'],"\n";
echo "COLOR: ",$m['COLOR'],"\n";
echo "LANG: ",$m['LANG'],"\n";
echo "EXT: ",$m['EXT'],"\n";
}
Output:
file=COMPANY_DE-Actual-Contents-of-File-RGB-ENG.pdf
COMPANY: COMPANY
LOCATION: DE
CONTENT: Actual-Contents-of-File
COLOR: RGB
LANG: ENG
EXT: pdf
file=COMPANY_BE-Actual-Contents-of-File-CMYK-ENG.pdf
COMPANY: COMPANY
LOCATION: BE
CONTENT: Actual-Contents-of-File
COLOR: CMYK
LANG: ENG
EXT: pdf
file=COMPANY_DE-Actual-Contents-of-File-PMS-GER.doc
COMPANY: COMPANY
LOCATION: DE
CONTENT: Actual-Contents-of-File
COLOR: PMS
LANG: GER
EXT: doc
file=COMPANY-Actual-Contents-of-File-PMS-GER.doc
COMPANY: COMPANY
LOCATION:
CONTENT: Actual-Contents-of-File
COLOR: PMS
LANG: GER
EXT: doc
file=COMPANY-Actual-Contents-of-File-GER.doc
COMPANY: COMPANY
LOCATION:
CONTENT: Actual-Contents-of-File
COLOR:
LANG: GER
EXT: doc
file=COMPANY-Actual-Contents-of-File.doc
COMPANY: COMPANY
LOCATION:
CONTENT: Actual-Contents-of-File
COLOR:
LANG:
EXT: doc
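Since the question asks for an array with named keys, the pattern above could be wrapped in a small function. This is a sketch under assumptions: the name parse_filename is hypothetical, and returning null for a non-matching name is just one possible convention.

```php
<?php
// Hypothetical wrapper around the pattern above: returns only the named groups.
function parse_filename(string $file): ?array
{
    $pattern = '/^(?<COMPANY>.*?)_?(?<LOCATION>DE|BE)?-(?<CONTENT>.*?)-?(?<COLOR>RGB|CMYK|PMS)?-?(?<LANG>ENG|GER)?\.(?<EXT>[^.]+)$/';
    if (!preg_match($pattern, $file, $m)) {
        return null; // the name does not follow the expected structure
    }
    // preg_match stores each named group twice (by name and by index);
    // keep only the entries with string keys.
    return array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
}

print_r(parse_filename('COMPANY_BE-Actual-Contents-of-File-CMYK-ENG.pdf'));
```

Parts that are absent from the file name come back as empty strings under their key.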
Answer 4 (score: 0)
Inspired by @Armatus, I built the following, which seems to be failsafe:
$string = "COMPANY_DE-Actual-Contents+of-File-RGB-ENG.pdf";
$options_location = array('DE','BE');
$options_color = array('RGB','CMYK','PMS');
$options_language = array('ENG','GER');
// -1 means "no limit"; passing NULL as the limit is deprecated as of PHP 8.1
$parts = preg_split( '/[\.\-\_]/', $string, -1, PREG_SPLIT_NO_EMPTY );
$data = array();
$data['company'] = array_shift($parts); // first part is always the company
$data['filetype'] = array_pop($parts); // last part is always the file type
// The location, if present, directly follows the company
if( isset( $parts[0] ) && in_array( $parts[0], $options_location ) ){
    $data['location'] = array_shift($parts);
}else{
    $data['location'] = NULL;
}
// The language, if present, is the last remaining part
if( in_array( end( $parts), $options_language ) ){
    $data['language'] = array_pop($parts);
}else{
    $data['language'] = NULL;
}
// The color mode, if present, comes right before the language
if( in_array( end( $parts), $options_color ) ){
    $data['colormode'] = array_pop($parts);
}else{
    $data['colormode'] = NULL;
}
$data['content'] = implode( ' ', $parts );
print_r( $data );
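To check the positional logic against a name whose content contains reserved words, it can be wrapped in a function and fed a trickier input. This is a sketch: the function name parse_positional and the test file name are made up for illustration.

```php
<?php
// Hypothetical function wrapping the positional logic above.
function parse_positional(string $string): array
{
    $options_location = array('DE', 'BE');
    $options_color = array('RGB', 'CMYK', 'PMS');
    $options_language = array('ENG', 'GER');

    // Split on '.', '-' and '_'; -1 means no limit
    $parts = preg_split('/[.\-_]/', $string, -1, PREG_SPLIT_NO_EMPTY);
    $data = array();
    $data['company'] = array_shift($parts);
    $data['filetype'] = array_pop($parts);
    // Location is only looked for at the front, language and color mode at the back
    $data['location'] = (isset($parts[0]) && in_array($parts[0], $options_location)) ? array_shift($parts) : null;
    $data['language'] = in_array(end($parts), $options_language) ? array_pop($parts) : null;
    $data['colormode'] = in_array(end($parts), $options_color) ? array_pop($parts) : null;
    $data['content'] = implode(' ', $parts);
    return $data;
}

// 'RGB' and 'GER' in the middle of the name stay part of the content
print_r(parse_positional('COMPANY_BE-RGB-vs-GER-Guide-CMYK-ENG.pdf'));
```

Only the ends of the remaining parts are inspected, so reserved words in the middle are left alone. A content that itself starts with a location code or ends with a color or language code would still be misread.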