Question

Suppose our $plain_css variable contains some CSS:

.slide-pause {
  cursor: url(http://example.com/img/bg/pause.png),url(http://example.com/img/bg/pause.png),auto;
}
.something {
  background-image: url('http://example.com/img/bg/beautiful.png'); // We have Quotes here
}

I need to get all the URLs from this CSS.

This is how I am trying to do it:

preg_match_all('!url\(\'?http://example.com/.*\)!', $plain_css, $matches);

What $matches contains:

array
  0 => 
  array
    0 => string 'url(http://example.com/img/bg/pause.png),url(http://localhost/site/img/bg/pause.png)'
    1 => string 'url(http://example.com/img/bg/beautiful.png)'

I need it to return:

array
  0 => string 'url(http://example.com/img/bg/pause.png)'
  1 => string 'url(http://example.com/img/bg/pause.png)'
  2 => string 'url(http://example.com/img/bg/beautiful.png)'

Answer 1

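You're a victim of greediness. .* matches as much as it can. For a quick fix, replace it with .*? to make it ungreedy. Or exclude ) from the repeated characters (this is generally preferable - it's more explicit and more efficient):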

preg_match_all('!url\(\'?http://example.com/[^)]*\)!', $plain_css, $matches);

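Note that you can't convince preg_match_all to return everything in a plain array - you will always get a nested array (which matters once you use capturing groups). But you can simply take the desired result from $matches[0].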
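
As a minimal sketch of that last point, the snippet below runs the negated-class pattern against the CSS from the question and reads the flat list out of $matches[0]. The name $urls is only illustrative. Note that the quoted url('...') keeps its quotes in the match, because the pattern consumes the optional opening quote and [^)]* runs up to the closing parenthesis.

```php
<?php
// Sample CSS from the question, stored in $plain_css.
$plain_css = <<<'EOT'
.slide-pause {
  cursor: url(http://example.com/img/bg/pause.png),url(http://example.com/img/bg/pause.png),auto;
}
.something {
  background-image: url('http://example.com/img/bg/beautiful.png');
}
EOT;

// [^)]* cannot cross a closing parenthesis, so each url(...) is matched on its own.
preg_match_all('!url\(\'?http://example.com/[^)]*\)!', $plain_css, $matches);

// The pattern has no capturing groups, so $matches has a single entry:
// index 0, the list of full matches.
$urls = $matches[0];
print_r($urls);
```

$urls then holds three entries, one per url(...) occurrence.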

Answer 2

You need to make your repetition quantifier lazy (the default is greedy):

preg_match_all('!url\(\'?http://example.com/.*?\)!', $plain_css, $matches);

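The only change here is that I've added a question mark after the * repetition quantifier. Normally, repetitions are greedy: that is, they match as many characters as possible (while still satisfying the expression). In this case, the greediness of the * quantifier consumed both url expressions in the input string. Changing to a lazy quantifier solves the problem.

Another way of handling this is to use a negated character class instead of the . metacharacter (which matches anything except a newline):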
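
To make the difference concrete, here is a small sketch that runs the greedy and the lazy pattern against the cursor line from the question. The variable names $line, $greedy and $lazy are only illustrative.

```php
<?php
// The cursor declaration from the question: two url(...) expressions on one line.
$line = 'cursor: url(http://example.com/img/bg/pause.png),url(http://example.com/img/bg/pause.png),auto;';

// Greedy: .* runs to the last ")" on the line, so both url(...) end up in one match.
preg_match_all('!url\(\'?http://example.com/.*\)!', $line, $greedy);

// Lazy: .*? stops at the first ")" that lets the whole pattern succeed.
preg_match_all('!url\(\'?http://example.com/.*?\)!', $line, $lazy);

var_dump(count($greedy[0])); // int(1)
var_dump(count($lazy[0]));   // int(2)
```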

preg_match_all('!url\(\'?http://example.com/[^)]*\)!', $plain_css, $matches);
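
If only the bare address is wanted, without the url( wrapper or the optional quotes, one possible extension (not part of the original answers, so treat it as an assumption about what is needed) is to put a capturing group around the address and also exclude the quote character from the class. A rough sketch:

```php
<?php
// Sample CSS from the question, stored in $plain_css.
$plain_css = <<<'EOT'
.slide-pause {
  cursor: url(http://example.com/img/bg/pause.png),url(http://example.com/img/bg/pause.png),auto;
}
.something {
  background-image: url('http://example.com/img/bg/beautiful.png');
}
EOT;

// The group (...) captures the address; [^)\']* stops at ")" or at a closing quote,
// and \'? then consumes that quote if it is present.
preg_match_all('!url\(\'?(http://example\.com/[^)\']*)\'?\)!', $plain_css, $matches);

// $matches[0] holds the full url(...) matches, $matches[1] the captured addresses.
print_r($matches[1]);
```

With this pattern, $matches[1] contains the three addresses with neither quotes nor the url( ) wrapper.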

Get all URLs from plain CSS
