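Extracting attribute values from a custom tag with a regular expression

Date: 2015-05-08 03:18:08

Tags: php regex

Thanks for taking a look at this. I'm using PHP, and I have a string like this: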

[QUOTE="name: Max-Fischer, post: 486662533, member: 123"]I don't so much dance as rhythmically convulse.[/QUOTE]

I want to extract the values inside the quotation marks and build an associative array like this:

["name" => "Max-Fischer", "post" => "486662533", "member" => "123"]

Then I want to remove the opening and closing [QUOTE] tags and replace them with custom HTML, like this:

<blockquote><a href="URL_I_WILL_GENERATE_FROM_THE_ARRAY_VALUES">Max-Fischer</a> wrote: I don't so much dance as rhythmically convulse.</blockquote>

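So the main problem is writing the preg_match() or preg_replace() calls to do two things: first, get the values into an array; second, remove the tags and replace them with my custom content. I can figure out how to build the custom HTML from the array; I just can't work out how to do the rest with regular expressions.

I tried a pattern like this to get the attribute values: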

/(\S+)=[\"\']?((?:.(?![\"\']?\s+(?:\S+)=|[>\"\']))+.)[\"\']?/

But it only returns:

[QUOTE

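And that doesn't even address how to get the values (if I could extract them) into an array.

Thanks in advance for your time.

Cheers.

2 Answers:

Answer 0: (score: 2)

If the tag you are looking for is always QUOTE, then something a bit simpler may do: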

  $s = '[QUOTE="name: Max-Fischer, post: 486662533, member: 123"]I don\'t so much dance as rhythmically convulse.[/QUOTE]';

  $r = '/\[QUOTE="(.*?)"\](.*)\[\/QUOTE\]/';

  $m = array();
  $arr = array();
  preg_match($r, $s, $m);
  // $m[0] = the full matched string
  // $m[1] = the string of attributes
  // $m[2] = the quote itself
  foreach(explode(',', $m[1]) as $valuepair) { // split the attributes on the comma
    preg_match('/\s*(.*): (.*)/', $valuepair, $mm);
    // mm[0] = the attribute pairing
    // mm[1] = the attribute name
    // mm[2] = the attribute value
    $arr[$mm[1]] = $mm[2];
  }
  print_r($arr);
  print $m[2] . "\n";

This gives the following output:

Array
(
    [name] => Max-Fischer
    [post] => 486662533
    [member] => 123
)
I don't so much dance as rhythmically convulse.

To handle the case where the string contains more than one quote, make the regular expression slightly less greedy and use preg_match_all instead of preg_match:

  $s ='[QUOTE="name: Max-Fischer, post: 486662533, member: 123"]I don\'t so much dance as rhythmically convulse.[/QUOTE]';
  $s .='[QUOTE="name: Some-Guy, post: 486562533, member: 1234"]Quidquid latine dictum sit, altum videtur[/QUOTE]';

  $r = '/\[QUOTE="(.*?)"\](.*?)\[\/QUOTE\]/';
  //                         ^  <--- added to make it less greedy
  $m = array();
  $arr = array();
  preg_match_all($r, $s, $m, PREG_SET_ORDER);
  // $m[0] = the match set for the first quote
  // $m[1] = the match set for the second quote
  // $m[0][0] = the full text of the first match
  // $m[0][1] = its string of attributes
  // $m[0][2] = the quote text itself
  // $m holds one element for each quote found in the string
  foreach($m as $match) { // since there may be more than one quote, we loop and operate on them individually
    $quote = array();
    foreach(explode(',', $match[1]) as $valuepair) { // split the attributes on the comma
      preg_match('/\s*(.*): (.*)/', $valuepair, $mm);
      // mm[0] = the attribute pairing
      // mm[1] = the attribute name
      // mm[2] = the attribute value
      $quote[$mm[1]] = $mm[2];
    }
    $arr[] = $quote; // we now build a parent array, to hold each individual quote
  }
  print_r($arr);
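
The question also asks for the [QUOTE] tags to be replaced with custom HTML, which the code above does not show. One way to sketch that step is with preg_replace_callback, reusing the same pattern. The helper names parseQuoteAttributes and renderQuotes and the /posts/ URL scheme are assumptions made up for illustration; this is a minimal sketch, not a complete BBCode parser.

```php
<?php
// Hypothetical helper: turn "name: X, post: Y" into an associative array.
function parseQuoteAttributes(string $attributes): array
{
    $result = [];
    foreach (explode(',', $attributes) as $pair) {
        // Split on the first colon only, so a value containing ':' survives.
        $parts = explode(':', $pair, 2);
        if (count($parts) === 2) {
            $result[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $result;
}

// Hypothetical helper: replace every [QUOTE="..."]...[/QUOTE] with a blockquote.
function renderQuotes(string $text): string
{
    $rendered = preg_replace_callback(
        '/\[QUOTE="(.*?)"\](.*?)\[\/QUOTE\]/s',
        function (array $m): string {
            $attrs = parseQuoteAttributes($m[1]);
            $name = htmlspecialchars($attrs['name'] ?? 'Unknown');
            // Assumed URL scheme, purely for illustration.
            $url = '/posts/' . rawurlencode($attrs['post'] ?? '');
            // The quote body ($m[2]) is inserted as-is; escape it if it is untrusted.
            return '<blockquote><a href="' . $url . '">' . $name . '</a> wrote: '
                . $m[2] . '</blockquote>';
        },
        $text
    );
    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $rendered ?? $text;
}

$s = '[QUOTE="name: Max-Fischer, post: 486662533, member: 123"]I don\'t so much dance as rhythmically convulse.[/QUOTE]';
echo renderQuotes($s), "\n";
```

Because the callback runs once per match, the same call handles a string that contains several quotes, and any text outside the tags is left untouched.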

Answer 1: (score: 1)

I managed to solve part of your problem: getting the associative array. I hope it helps.

Here is the code:

$str =  <<< PP
[QUOTE=" name : Max-Fischer,post : 486662533,member : 123 "]I don't so much dance as rhythmically convulse.[/QUOTE]
PP;

preg_match_all('/^\[QUOTE=\"(.*?)\"\](?:.*?)]$/', $str, $matches);
// Hyphens are allowed so that values such as Max-Fischer are captured whole
preg_match_all('/([a-zA-Z0-9-]+)\s+:\s+([a-zA-Z0-9-]+)/', $matches[1][0], $result);

$your_data = array_combine($result[1],$result[2]);

echo "<pre>";
print_r($your_data);
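
The same array_combine idea can be wrapped in a small function that does not require whitespace around the colon, keeps hyphenated values such as Max-Fischer whole, and returns an empty array when no tag is found. The function name quoteAttributes is hypothetical, and the character class [\w-] is an assumption about which characters may appear in names and values.

```php
<?php
// Hypothetical helper; not part of the original answer.
function quoteAttributes(string $str): array
{
    // Grab the attribute string from the first opening [QUOTE="..."] tag.
    if (preg_match('/\[QUOTE="(.*?)"\]/', $str, $m) !== 1) {
        return [];
    }
    // Capture each "key : value" pair; whitespace around the colon is optional.
    preg_match_all('/([\w-]+)\s*:\s*([\w-]+)/', $m[1], $pairs);
    // $pairs[1] holds the keys and $pairs[2] the values, in matching order.
    return array_combine($pairs[1], $pairs[2]);
}

print_r(quoteAttributes(
    '[QUOTE=" name : Max-Fischer,post : 486662533,member : 123 "]I don\'t so much dance as rhythmically convulse.[/QUOTE]'
));
```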