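I'm trying to use a PHP regular expression to extract several parts from a string. To show what I mean, here is an excerpt of the file contents (the real file contains hundreds of these groups):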
part "C28"
{ type : "1AB010050093",
%cadtype : "1AB010050094",
shapeid : "2_1206",
descr : "4700.0000 pFarad 10.00 % 100.0 - VE5-VS3",
insclass : "CP6A,CP6B",
gentype : "RECT_032_016_006",
machine : "SMT",
%package : "080450E",
%_item_number: "508",
%_Term_Seq : "" }
part "C29"
{ type : "1AB008140029",
shapeid : "2_1206",
descr : "150.0000 pFarad 5.00 % 100.0 Volt NP0 CERAMIC CAPACITOR",
insclass : "CP6A,CP6B",
gentype : "RECT_032_016_006",
machine : "SMT",
%package : "080450E",
%_item_number: "3",
%_Term_Seq : "" }
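As you can see, the excerpt contains two groups with the same layout. I need to search the whole file and, for every group, extract the part reference (for example C28) and its associated type (for example 1AB010050093). I'm not sure what the best way to do this is.
If you need more information, please let me know. Thanks in advance!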
Answer 0 (score: 11)
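This expression will:
- capture the reference (`ref`) that follows `part`
- capture the values of the `type` and `descr` fields; the PHP code further down names the `type` capture `partnumber`
- find the fields in any order inside the braces, because each one is located with its own lookahead
- treat `descr` as optional and capture it only when it is present; the `(?:` ... `)?` brackets around the `descr` lookahead make the field optional

Note that this is a single expression written across several lines, so use the `x` option so that the regex engine ignores the whitespace.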
^part\s"(?P<ref>[^"]*)"[^{]*{
(?:(?=[^}]*\sdescr\s*:\s+"(?P<descr>[^"]*)"))?
(?=[^}]*\stype\s*:\s+"(?P<type>[^"]*)")
Input text
part "C28"
{ type : "1AB010050093",
%cadtype : "1AB010050094",
shapeid : "2_1206",
descr : "4700.0000 pFarad 10.00 % 100.0 - VE5-VS3",
insclass : "CP6A,CP6B",
gentype : "RECT_032_016_006",
machine : "SMT",
%package : "080450E",
%_item_number: "508",
%_Term_Seq : "" }
part "C29"
{ type : "1AB008140029",
shapeid : "2_1206",
descr : "150.0000 pFarad 5.00 % 100.0 Volt NP0 CERAMIC CAPACITOR",
insclass : "CP6A,CP6B",
gentype : "RECT_032_016_006",
machine : "SMT",
%package : "080450E",
%_item_number: "3",
%_Term_Seq : "" }
part "C30"
{ type : "1AB0081400 30",
shapeid : "2_1206 30",
insclass : "CP6A,CP6B 30",
gentype : "RECT_032_016_006 30",
machine : "SMT 30",
%package : "080450E 30 ",
%_item_number: "3 30 ",
%_Term_Seq : "30" }
Code
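<?php
// Replace with the real file contents, e.g. the result of file_get_contents().
$sourcestring = "your source string";
// i: case-insensitive, m: ^ matches at every line start, s: dot matches newlines, x: ignore whitespace in the pattern.
preg_match_all('/^part\s"(?P<ref>[^"]*)"[^{]*{
(?:(?=[^}]*\sdescr\s*:\s+"(?P<descr>[^"]*)"))?
(?=[^}]*\stype\s*:\s+"(?P<partnumber>[^"]*)")/imsx', $sourcestring, $matches);
// Wrap the dump in <pre> so the browser keeps the line breaks.
echo "<pre>" . print_r($matches, true) . "</pre>";
?>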
Matches
$matches Array:
(
[ref] => Array
(
[0] => C28
[1] => C29
[2] => C30
)
[descr] => Array
(
[0] => 4700.0000 pFarad 10.00 % 100.0 - VE5-VS3
[1] => 150.0000 pFarad 5.00 % 100.0 Volt NP0 CERAMIC CAPACITOR
[2] =>
)
[partnumber] => Array
(
[0] => 1AB010050093
[1] => 1AB008140029
[2] => 1AB0081400 30
)
)
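The result above is grouped by capture name. If one record per part is more convenient, the same expression can be run with `PREG_SET_ORDER`. The following is only a sketch: the sample is a trimmed-down version of the input text, and the names `$sets` and `$parts` are made up for illustration.

```php
<?php
// Trimmed sample in the same format as the input text (C30 has no descr field).
$sourcestring = <<<'TXT'
part "C28"
{ type : "1AB010050093",
descr : "4700.0000 pFarad 10.00 % 100.0 - VE5-VS3",
machine : "SMT" }
part "C30"
{ type : "1AB0081400 30",
machine : "SMT 30" }
TXT;

$pattern = '/^part\s"(?P<ref>[^"]*)"[^{]*{
(?:(?=[^}]*\sdescr\s*:\s+"(?P<descr>[^"]*)"))?
(?=[^}]*\stype\s*:\s+"(?P<partnumber>[^"]*)")/imsx';

// PREG_SET_ORDER returns one array per matched part instead of one array per group.
preg_match_all($pattern, $sourcestring, $sets, PREG_SET_ORDER);

$parts = [];
foreach ($sets as $set) {
    $parts[$set['ref']] = [
        'partnumber' => $set['partnumber'],
        // descr is optional: an empty or missing capture is stored as null.
        'descr' => ($set['descr'] ?? '') !== '' ? $set['descr'] : null,
    ];
}
print_r($parts);
```

Keying the array by `ref` assumes that every part reference is unique in the file; if references can repeat, append to a list instead.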
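Answer 1 (score: 2)
Assuming every group has the same structure, with `type` as the first field after the opening brace, you can use this pattern: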
preg_match_all('~([^"]++)"[^{"]++[^"]++"([^"]++)~', $subject, $matches);
print_r($matches);
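With this pattern, `$matches[1]` holds the part references and `$matches[2]` the type values, so the two can be paired up with `array_combine`. A minimal sketch, assuming a shortened sample and unique references; `$types` is a name chosen here for illustration:

```php
<?php
// Shortened sample; type is the first field of every group, as the pattern requires.
$subject = <<<'TXT'
part "C28"
{ type : "1AB010050093",
%cadtype : "1AB010050094",
shapeid : "2_1206" }
part "C29"
{ type : "1AB008140029",
shapeid : "2_1206" }
TXT;

preg_match_all('~([^"]++)"[^{"]++[^"]++"([^"]++)~', $subject, $matches);

// Pair each reference (group 1) with its type (group 2).
$types = array_combine($matches[1], $matches[2]);
print_r($types);
```

The possessive quantifiers (`++`) matter here: they stop the engine from backtracking into a shorter match, which is what keeps the match anchored on the quoted reference that is directly followed by the opening brace.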
Edit:
Note: if you have more information to extract, you can easily convert the data to JSON, for example:
$data = <<<LOD
part "C28"
{ type : "1AB010050093",
%cadtype : "1AB010050094",
shapeid : "2_1206",
descr : "4700.0000 pFarad 10.00 % 100.0 - VE5-VS3",
insclass : "CP6A,CP6B",
gentype : "RECT_032_016_006",
machine : "SMT",
%package : "080450E",
%_item_number: "508",
%_Term_Seq : "" }
part "C29"
{ type : "1AB008140029",
shapeid : "2_1206",
descr : "150.0000 pFarad 5.00 % 100.0 Volt NP0 CERAMIC CAPACITOR",
insclass : "CP6A,CP6B",
gentype : "RECT_032_016_006",
machine : "SMT",
%package : "080450E",
%_item_number: "3",
%_Term_Seq : "" }
LOD;
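// Rewrite the delimiters so that the text becomes JSON object syntax.
// str_replace applies the pairs in order, so the ':' rule also turns the ':{"'
// produced by the previous rule into '":{"'.
// Caveat: this only works while no value contains ':' or the word 'part'.
$trans = array( "}\n" => '}, ' , 'part' => '' ,
"\"\n{" => ':{"' , ':' => '":' ,
"\",\n" => '","' );
$data = str_replace(array_keys($trans), $trans, $data);
// Strip the whitespace around every double quote.
$data = preg_replace('~\s*+"\s*+~', '"', $data);
// Drop the leading quote, then wrap everything in one outer object.
$json_data = json_decode('{"'.substr($data,1).'}');
// Each top-level key is a part reference; its value holds that part's fields.
foreach ($json_data as $key=>$value) {
echo '<br/><br/>part: ' . $key . '<br/>type: ' . $value->type;
}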
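If the values may contain a colon or the word `part`, the string replacements above would corrupt them. A different technique, not the one used in this answer, is to parse the blocks directly into arrays with two regular expressions. This is a rough sketch that assumes no value contains `"` or `}`; the second `descr` value is invented to show a colon inside a value, and `$blocks`, `$fields` and `$parts` are illustrative names.

```php
<?php
$data = <<<'TXT'
part "C28"
{ type : "1AB010050093",
%cadtype : "1AB010050094",
descr : "4700.0000 pFarad 10.00 % 100.0 - VE5-VS3",
%_Term_Seq : "" }
part "C29"
{ type : "1AB008140029",
descr : "ratio 1:2 part" }
TXT;
// The C29 descr above is made-up test data, not taken from the real file.

$parts = [];
// Step 1: one match per  part "REF" { ... }  block.
preg_match_all('~^part\s+"([^"]*)"\s*\{([^}]*)\}~m', $data, $blocks, PREG_SET_ORDER);
foreach ($blocks as [, $ref, $body]) {
    // Step 2: every  key : "value"  pair inside the block.
    preg_match_all('~([%\w]+)\s*:\s*"([^"]*)"~', $body, $fields, PREG_SET_ORDER);
    foreach ($fields as [, $key, $value]) {
        $parts[$ref][$key] = $value;
    }
}
print_r($parts);
```

Each part then becomes an associative array, so a field is read as `$parts['C28']['type']`.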