Regex not parsing multi-line text

Date: 2015-10-15 14:33:00

Tags: php regex

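I am working on a project where I am building a price guide for a trading card game. Forgive the level of nerdiness here. I am pulling data from a website: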

$data = mb_convert_encoding(file_get_contents("http://yugioh.wikia.com/api.php?action=query&prop=revisions&titles=Elemental%20HERO%20Shining%20Flare%20Wingman&rvprop=content&format=php"), "HTML-ENTITIES", "UTF-8");

Then I parse it with a series of regex statements.

preg_match_all('/(?<=\|\slore)\s+\=(.*)/', $data, $matches);
$text = $matches[1][0]; //it prints out here just fine
$text = preg_replace("/(\[\[(\w+|\s)*\|)/sx", "" , $text); //it disappears if I try to print it here
$text = preg_replace("/\[\[/", "" , $text);
$text = preg_replace("/\]\]/", "" , $text);

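As you can see, on the second line I grab the match, and if I follow that line with a print_r statement it prints the text. If I follow the next line with a print statement, it prints nothing. By that logic, the regex on that line is not parsing correctly. What am I doing wrong? I thought it had something to do with multi-line matching, but I tried that and it did not help.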

Edit

This is the text after the first extraction:

 "[[Elemental HERO Flame Wingman]]" + "[[Elemental HERO Sparkman]]"
Must be [[Fusion Summon]]ed and cannot be [[Special Summon]]ed by other ways. This card gains 300 [[ATK]] for each "[[Elemental HERO]]" card in your [[Graveyard]]. When this card [[destroy]]s a [[Monster Card|monster]] [[Destroyed by Battle|by battle]] and [[send]]s it to the Graveyard: Inflict [[Effect Damage|damage]] to your opponent equal to the ATK of the destroyed monster in the Graveyard.

1 Answer:

Answer 0 (score: 2)

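The regex /(\[\[(\w+|\s)*\|)/sx contains nested quantifiers: \w is quantified with +, and * is applied to the whole alternation group. When a link has no | after its text (for example [[ATK]]), the engine tries every possible way of splitting the words between the two quantifiers before giving up. This produces a huge number of backtracking steps and results in catastrophic backtracking.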
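This also explains why the text seems to disappear rather than come back unchanged: preg_replace() returns null when PCRE reports an error, and printing null prints nothing. A minimal sketch of how the failure could be surfaced instead of silently lost is shown here; the helper name replace_or_fail is hypothetical, and the exact error code you see depends on your PHP version and PCRE settings.

```php
<?php
// Hypothetical helper: wraps preg_replace() and raises on PCRE failure
// instead of silently handing back null.
function replace_or_fail(string $pattern, string $replacement, string $subject): string
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        // preg_last_error() returns a PREG_* constant describing the failure,
        // e.g. PREG_BACKTRACK_LIMIT_ERROR when the backtrack limit is hit.
        throw new RuntimeException('preg_replace failed, PCRE error code ' . preg_last_error());
    }
    return $result;
}

// A harmless pattern goes through unchanged in behaviour.
echo replace_or_fail('/\[\[/', '', '[[ATK]]'), "\n"; // prints: ATK]]
```

Wrapping the problematic call from the question in a check like this would likely report an error code rather than an empty string, which points at the pattern instead of the input.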

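The best way to avoid this problem here is to use a character class, [\w\s]* (matching 0 or more word or whitespace characters), so that only a single quantifier is involved.

See the IDEONE demo: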

$s = "\"[[Elemental HERO Flame Wingman]]\" + \"[[Elemental HERO Sparkman]]\"\nMust be [[Fusion Summon]]ed and cannot be [[Special Summon]]ed by other ways. This card gains 300 [[ATK]] for each \"[[Elemental HERO]]\" card in your [[Graveyard]]. When this card [[destroy]]s a [[Monster Card|monster]] [[Destroyed by Battle|by battle]] and [[send]]s it to the Graveyard: Inflict [[Effect Damage|damage]] to your opponent equal to the ATK of the destroyed monster in the Graveyard.";
$s = preg_replace('/(\[\[([\w\s]*)\|)/', "" , $s);
echo $s;

Also note that you do not need the x modifier (the pattern itself contains no comments or insignificant whitespace) or the s modifier (there is no . in the pattern).
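Putting the pieces together, the three replacements from the question could be folded into one small function. This is only a sketch: the function name strip_wiki_links is made up for illustration, and swapping the last two preg_replace() calls for str_replace() is my own simplification, since [[ and ]] are literal strings. Link targets containing punctuation that [\w\s] does not cover (apostrophes or hyphens, for instance) would not be stripped by this pattern.

```php
<?php
// Hypothetical helper name; not part of the original question or answer.
function strip_wiki_links(string $text): string
{
    // Drop the "[[Target|" prefix of piped links, keeping only the label.
    // No x or s modifier is needed for this pattern.
    $text = preg_replace('/\[\[[\w\s]*\|/', '', $text);
    if ($text === null) {
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }
    // The remaining brackets are literal strings, so no regex is required.
    return str_replace(['[[', ']]'], '', $text);
}

echo strip_wiki_links('a [[Monster Card|monster]] with 300 [[ATK]]'), "\n";
// prints: a monster with 300 ATK
```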