I'm having trouble extracting links with a regular expression using preg_match_all().
I have the following string:
some random text <a href="http://localhost/example/wp-content/uploads/2014/07/Link1.pdf\">Link1</a><a href="http://localhost/example/wp-content/uploads/2014/07/Link2.pdf\">Link2</a>
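I want to extract the link to each file and the file format (extension) into two separate variables.
Are there any regex gurus here? I've been struggling with this all day.
Thanks!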
Answer 0 (score: 1)
(?<=href=")(.*?\.(.*?))\\
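Try this and just grab the captures: group 1 is the link and group 2 is the file extension. The pattern ends with an escaped backslash because each href value in your string has a \ before the closing quote. See the demo: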
http://regex101.com/r/lS5tT3/80
$data = 'some random text <a href="http://localhost/example/wp-content/uploads/2014/07/Link1.pdf\">Link1</a><a href="http://localhost/example/wp-content/uploads/2014/07/Link2.pdf\">Link2</a>';
$regex = '/(?<=href=")(.*?\.(.*?))\\\\/';
preg_match_all($regex, $data, $matches);
print_r($matches);
Output:
Array
(
    [0] => Array
        (
            [0] => http://localhost/example/wp-content/uploads/2014/07/Link1.pdf\
            [1] => http://localhost/example/wp-content/uploads/2014/07/Link2.pdf\
        )
    [1] => Array
        (
            [0] => http://localhost/example/wp-content/uploads/2014/07/Link1.pdf
            [1] => http://localhost/example/wp-content/uploads/2014/07/Link2.pdf
        )
    [2] => Array
        (
            [0] => pdf
            [1] => pdf
        )
)
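One caveat with the pattern above: the lazy `.*?\.` stops at the first dot in the URL, so a host such as example.com would put everything after that dot into group 2. The following is only a sketch of a variant, not the answer's original pattern. It assumes the extension is purely alphanumeric and that the backslash before the closing quote may or may not be present. The variable names `$links` and `$extensions` are my own, chosen to show the "two separate variables" the question asks for.

```php
<?php
// Same input as in the question; the backslash before each closing quote is part of the data.
$data = 'some random text <a href="http://localhost/example/wp-content/uploads/2014/07/Link1.pdf\">Link1</a>'
      . '<a href="http://localhost/example/wp-content/uploads/2014/07/Link2.pdf\">Link2</a>';

// Group 1: the link, made of characters that are neither a double quote nor a backslash.
// Group 2: the alphanumeric run after the last dot of the link (assumed to be the extension).
// The trailing backslash is optional, so hrefs without it are matched as well.
$regex = '/href="([^"\\\\]+\.([A-Za-z0-9]+))\\\\?"/';

preg_match_all($regex, $data, $matches);

$links      = $matches[1]; // full links, without the trailing backslash
$extensions = $matches[2]; // file extensions

print_r($links);
print_r($extensions);
```

For the sample string, `$links` holds the two PDF URLs and `$extensions` holds `pdf` twice. Because `[^"\\]+` is greedy, the regex engine backtracks from the end of the href value, so the last dot is the one that separates the extension.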