Parsing @font-face with regular expressions in PHP

Time: 2018-03-28 21:49:11

Tags: php regex

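I know this already sounds terrible, but this is one of those tasks where I am forced to do something I don't like, so I am asking for help. Please take a moment to look at it. Yes, I have already looked at everything you might suggest, such as CSS parsers and regex snippets on Stack Overflow, but this is a specific case. I hate it, but I have to get it done.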

I need to parse a CSS file, grab all the @font-face kits, and put them in an array whose keys are made up of font-family followed by font-weight.

The desired format is

    array
(
    [montserat400] => array
    (
        [font-family] => ''montserat''
        [src] => 'url('montserrat-regular-webfont.woff') format('woff')'
        [font-weight] => '400'
        [font-style] => 'normal'
    )
    [montserat500] => array
    (
        [font-family] => ''montserat''
        [src] => 'url('montserrat-medium-webfont.woff') format('woff')'
        [font-weight] => '500'
        [font-style] => 'normal'
    )
)

Here is the CSS

/*! Generated by Font Squirrel (https://www.fontsquirrel.com) on March 25, 2018 */
@font-face {
    font-family: 'montserat';
    src: url('montserrat-regular-webfont.woff') format('woff');
    font-weight: 400;
    font-style: normal;
}
@font-face {
    font-family: 'montserat';
    src: url('montserrat-medium-webfont.woff') format('woff');
    font-weight: 500;
    font-style: normal;
}
@font-face {
    font-family: 'montserat';
    src: url('montserrat-semibold-webfont.woff') format('woff');
    font-weight: 600;
    font-style: normal;
}
@font-face {
    font-family: 'montserat';
    src: url('montserrat-bold-webfont.woff') format('woff');
    font-weight: 700;
    font-style: normal;
}

@font-face {
    font-family: 'montserat';
    src: url('montserrat-extrabold-webfont.woff') format('woff');
    font-weight: 800;
    font-style: normal;
}

Among the other things I have tried, this is what currently works

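// \K drops the "@font-face {" prefix so each match is only the block body
$re = '/@font-face.*{\K[^}]*(?=})/';
preg_match_all($re, $css, $matches, PREG_SET_ORDER);

if ($matches) {

    $parsed = array();

    foreach ($matches as $k => $ff) {

        // Use a separate variable so the input $css is not overwritten
        $block = $ff[0];
        $attrs = explode(";", $block);

        foreach ($attrs as $attr) {
            if (strlen(trim($attr)) > 0) {
                // Limit to 2 parts so colons inside values (e.g. in URLs) survive
                $kv = explode(":", trim($attr), 2);
                $parsed[$k][trim($kv[0])] = trim($kv[1]);
            }
        }

        unset($attrs);
    }
    print_r($parsed);

}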

It gives me this, which is usable; I can run another loop and set the keys the way I like

array
(
    [0] => array
    (
        [font-family] => ''montserat''
        [src] => 'url('montserrat-regular-webfont.woff') format('woff')'
        [font-weight] => '400'
        [font-style] => 'normal'
    )
    [1] => array
    (
        [font-family] => ''montserat''
        [src] => 'url('montserrat-medium-webfont.woff') format('woff')'
        [font-weight] => '500'
        [font-style] => 'normal'
    )
    [2] => array
    (
        [font-family] => ''montserat''
        [src] => 'url('montserrat-semibold-webfont.woff') format('woff')'
        [font-weight] => '600'
        [font-style] => 'normal'
    )
    [3] => array
    (
        [font-family] => ''montserat''
        [src] => 'url('montserrat-bold-webfont.woff') format('woff')'
        [font-weight] => '700'
        [font-style] => 'normal'
    )
    [4] => array
    (
        [font-family] => ''montserat''
        [src] => 'url('montserrat-extrabold-webfont.woff') format('woff')'
        [font-weight] => '800'
        [font-style] => 'normal'
    )
)

But it seems like I am doing a lot for this, and using 3 loops over the same data feels wrong.

2 Answers:

Answer 0: (score: 1)

I used two capture groups, one for the key and another for the value, and a single foreach to assign the keys and values.

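// Group 1 is the property name, group 2 its value; /m makes ^ and $ match per line
$re = '/^(.*):\s*(.*)\s*;$/m';
$matches = array();

preg_match_all($re, $css, $matches, PREG_SET_ORDER);

$parsed = array();
$count = 0;

foreach ($matches as $match) {
    // Start a fresh block at the first of every four declarations
    if ($count === 0) {
        $tmp = array();
    }

    $tmp[trim($match[1])] = trim($match[2]);

    // Assumes every @font-face block has exactly four declarations
    if ($count === 3) {
        $parsed[] = $tmp;
        $count = 0;
    } else {
        $count++;
    }
}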

echo "<pre>";
print_r($parsed);
echo "</pre>";

This gives me:

Array
(
    [0] => Array
        (
            [font-family] => 'montserat'
            [src] => url('montserrat-regular-webfont.woff') format('woff')
            [font-weight] => 400
            [font-style] => normal
        )

    [1] => Array
        (
            [font-family] => 'montserat'
            [src] => url('montserrat-medium-webfont.woff') format('woff')
            [font-weight] => 500
            [font-style] => normal
        )

    [2] => Array
        (
            [font-family] => 'montserat'
            [src] => url('montserrat-semibold-webfont.woff') format('woff')
            [font-weight] => 600
            [font-style] => normal
        )

    [3] => Array
        (
            [font-family] => 'montserat'
            [src] => url('montserrat-bold-webfont.woff') format('woff')
            [font-weight] => 700
            [font-style] => normal
        )

    [4] => Array
        (
            [font-family] => 'montserat'
            [src] => url('montserrat-extrabold-webfont.woff') format('woff')
            [font-weight] => 800
            [font-style] => normal
        )

)
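The fixed count of four declarations above works for this stylesheet but would drift if a block had more or fewer properties, and the result is still numerically indexed rather than keyed by family and weight as the question asks. A minimal sketch of one way to handle both, assuming blocks contain no nested braces and values contain no semicolons (so data: URIs would break it). The function name `parseFontFaces` and the shortened sample CSS are hypothetical, introduced only for illustration:

```php
<?php
// Hypothetical helper: parse every @font-face block in $css into an array
// keyed by font-family + font-weight (e.g. "montserat400").
function parseFontFaces(string $css): array
{
    $parsed = [];
    // Capture the body of each @font-face block (no nested braces assumed).
    preg_match_all('/@font-face\s*\{([^}]*)\}/', $css, $blocks);

    foreach ($blocks[1] as $body) {
        $props = [];
        // One declaration per match: property name, colon, value up to ";".
        preg_match_all('/([\w-]+)\s*:\s*([^;]+)/', $body, $decls, PREG_SET_ORDER);
        foreach ($decls as $decl) {
            $props[$decl[1]] = trim($decl[2]);
        }
        // Build the key from the unquoted family name plus the weight.
        $family = trim($props['font-family'] ?? '', "\"' ");
        $weight = $props['font-weight'] ?? '';
        $parsed[$family . $weight] = $props;
    }
    return $parsed;
}

// Shortened sample input (an assumption, trimmed from the question's CSS).
$css = <<<'CSS'
@font-face {
    font-family: 'montserat';
    src: url('montserrat-regular-webfont.woff') format('woff');
    font-weight: 400;
    font-style: normal;
}
@font-face {
    font-family: 'montserat';
    src: url('montserrat-medium-webfont.woff') format('woff');
    font-weight: 500;
    font-style: normal;
}
CSS;

$fonts = parseFontFaces($css);
print_r(array_keys($fonts));
```

Note that two blocks sharing the same family and weight (for example a normal and an italic face) would produce the same key, so the later one would overwrite the earlier one; appending font-style to the key is one possible way around that.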

Answer 1: (score: 1)

TL;DR

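# Assumes $text begins at the first @font-face: any leading text (such as a
# generator comment) becomes a stray first element and misaligns the pairs
$matches = preg_split('~(?>@font-face\s*{\s*|\G(?!\A))(\S+)\s*:\s*([^;]+);\s*~', $text,
    -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
# index counter
$i = 0;
$output = [];
# In PHP 7 a by-value foreach works on a copy of the array, hence iterating
# by reference so that the unset() below is seen by the loop
foreach ($matches as $key => &$match) {
    # Check that we have not reached the end of a block
    if (strpos($match, "}") !== 0) {
        # Store the current element as key, the next one as value
        $output[$i][$match] = $matches[$key + 1];
        # Remove the next element so the iteration skips over it
        unset($matches[$key + 1]);
        continue;
    }
    # Increment index counter
    $i++;
}
# Break the reference left behind by the loop
unset($match);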

print_r($output);

PHP live demo

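Using the \G token you can walk through all the properties and their values in one pass. You can either match or split. I prefer the latter: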

(?>@font-face\s*{\s*|\G(?!\A))(\S+)\s*:\s*([^;]+);\s*

RegEx live demo

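Breakdown

  • (?> Start of an atomic (non-capturing) group
    • @font-face\s*{\s* Match the opening of a @font-face block
    • | Or
    • \G(?!\A) Continue from where the previous match ended
  • ) End of the group
  • (\S+) Match and capture the property name
  • \s*:\s* Match the colon and any surrounding whitespace
  • ([^;]+) Match and capture the value
  • ;\s* Match a ; and trailing whitespace (if any)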
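The breakdown above applies equally to the matching alternative mentioned earlier. A minimal sketch of it with preg_match_all, assuming the same pattern and a shortened sample stylesheet (the sample CSS and the variable names are assumptions for illustration). A match whose full text starts with "@font-face" came from the first alternative and therefore opens a new block; every other match came from \G and belongs to the current block:

```php
<?php
// Shortened sample input (an assumption, not the full stylesheet).
$css = <<<'CSS'
@font-face {
    font-family: 'montserat';
    font-weight: 400;
}
@font-face {
    font-family: 'montserat';
    font-weight: 500;
}
CSS;

// Same pattern as above: group 1 is the property name, group 2 the value.
$re = '~(?>@font-face\s*{\s*|\G(?!\A))(\S+)\s*:\s*([^;]+);\s*~';
preg_match_all($re, $css, $matches, PREG_SET_ORDER);

$output = [];
$i = -1;
foreach ($matches as $m) {
    // The first declaration of each block carries the "@font-face {" prefix.
    if (strpos($m[0], '@font-face') === 0) {
        $i++;
    }
    $output[$i][$m[1]] = $m[2];
}

print_r($output);
```

Because the \G chain stops at the closing brace, text between blocks (comments, blank lines) is simply skipped instead of appearing in the result, which is the main practical difference from the split variant.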