PHPWord - Remove all search patterns from a Word document

Date: 2015-11-11 23:20:56

Tags: preg-match-all docx phpword

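I am using PHPWord to read my docx template, and at the end of processing I want to remove all search patterns from it. The search patterns in the docx template look like this: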

${SOMETAG}
text block
${/SOMETAG}

The XML behind the docx has a structure that looks like this:

<w:p w:rsidR="00E21534" w:rsidRDefault="00E21534" w:rsidP="00E21534"><w:r><w:t>${SOMETAG}</w:t></w:r></w:p><w:p w:rsidR="00E21534" w:rsidRDefault="00E21534" w:rsidP="00E21534"><w:r><w:t>text block</w:t></w:r></w:p><w:p w:rsidR="00E21534" w:rsidRDefault="00E21534" w:rsidP="00E21534"><w:r><w:t>${/SOMETAG}</w:t></w:r></w:p>

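Here is the function I wrote to remove the opening tag, i.e. ${SOMETAG}, but it never seems to find the tag. I think the problem is the pattern in my preg_match_all call. Can you tell me what I am doing wrong here?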

public function removeSearchPatterns()
{
    //search for ${*}
    preg_match_all(
        '/(${.*})/is',
        $this->tempDocumentMainPart,
        $matches,
        PREG_SET_ORDER
    );

    //remove ${*}
    foreach ($matches as $match){
        if (isset($match[0])) {
            $this->tempDocumentMainPart = str_replace(
                $match[0],
                '',
                $this->tempDocumentMainPart
            );
        }
    }

}
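
For reference, a minimal sketch of why the pattern above finds nothing: in a PCRE pattern an unescaped `$` is an end-of-subject anchor, so it can never be followed by a literal `{`. Escaping it as `\$` makes it match the dollar sign. The sample string and variable names below are illustrative only; the lazy `.*?` is an assumption added so the match stops at the first `}` instead of running to the last one.

```php
<?php
// Sample input modeled on the template markup shown in the question.
$xml = '<w:p><w:r><w:t>${SOMETAG}</w:t></w:r></w:p>';

// Original pattern: `$` anchors at the end of the subject, so `{` cannot follow it.
$unescaped = preg_match_all('/(${.*})/is', $xml, $m1);

// Escaped `\$` matches a literal dollar sign; `.*?` stops at the first `}`.
$escaped = preg_match_all('/\$\{.*?\}/s', $xml, $m2);

var_dump($unescaped); // match count for the original pattern
var_dump($escaped);   // match count for the escaped pattern
print_r($m2[0]);      // the matched placeholders
```

Note that a greedy `.*` combined with the `s` flag would, once the `$` is escaped, swallow everything from the first `${` to the last `}` in the document.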

2 answers:

Answer 0 (score: 0)

I got it working! Suggestions for a more efficient approach are welcome, though.

public function removeSearchPatterns()
{
    //search for ${*} and ${/*} 
    preg_match_all(
        '/(\${)(\b[^}]*)(})(.*?)(\${.)(\2)(})/',
        $this->temporaryDocumentMainPart,
        $matches,
        PREG_SET_ORDER
    );
    //if ${*} and ${/*} are found, do the removal
    if (!empty($matches)){
        //remove ${*} and ${/*}
        foreach ($matches as $match){               
            if (isset($match[0])) {
                $this->temporaryDocumentMainPart = str_replace(
                    $match[0],
                    $match[4],
                    $this->temporaryDocumentMainPart
                );
            }
        }

        //search for <w:p> to </w:p>
        preg_match_all(
            '/(<w:p[^>]*><w:r[^>]*><w:t><\/w:t><\/w:r>)(?:<[^<>]+><[^<>]+>)?(<\/w:p>)/',
            $this->temporaryDocumentMainPart,
            $matches,
            PREG_SET_ORDER
        );

        //remove <w:p> to </w:p>
        foreach ($matches as $match){
            if (isset($match[0])) {             
                $this->temporaryDocumentMainPart = str_replace(
                    $match[0],
                    '',
                    $this->temporaryDocumentMainPart
                );
            }
        }
    }
}
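
The same two steps can probably be expressed with two preg_replace calls. The following is only a sketch: `stripBlockMarkers` is a hypothetical standalone helper, not part of PHPWord, and it assumes tag names consist of word characters and that each marker sits in a single `<w:t>` run, as in the question's XML.

```php
<?php
/**
 * Hypothetical helper (not a PHPWord API): strips paired ${TAG} ... ${/TAG}
 * markers while keeping the content between them, then removes paragraphs
 * whose only run is left empty.
 */
function stripBlockMarkers(string $xml): string
{
    // \1 is a backreference, so the closing name must equal the opening one.
    // '$2' re-inserts whatever stood between the two markers.
    $xml = preg_replace('/\$\{(\w+)\}(.*?)\$\{\/\1\}/s', '$2', $xml) ?? $xml;

    // Drop paragraphs that now hold nothing but an empty <w:t></w:t> run.
    $empty = '/<w:p\b[^>]*><w:r\b[^>]*><w:t><\/w:t><\/w:r><\/w:p>/';
    return preg_replace($empty, '', $xml) ?? $xml;
}

// Simplified version of the XML from the question (attributes omitted).
$xml = '<w:p><w:r><w:t>${SOMETAG}</w:t></w:r></w:p>'
     . '<w:p><w:r><w:t>text block</w:t></w:r></w:p>'
     . '<w:p><w:r><w:t>${/SOMETAG}</w:t></w:r></w:p>';

echo stripBlockMarkers($xml), PHP_EOL;
```

Word often splits a placeholder across several runs when the text has been edited, in which case neither this sketch nor the regex above will see it as one marker.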

Answer 1 (score: 0)

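I wrote a method based on your code, but simplified it to remove only the ${...} and ${/...} markers themselves. Thanks for the code above, by the way.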

public function removeSearchPatterns()
{
    //search for ${*} and ${/*}
    preg_match_all(
        '/(\$\{[^\}]*\})/',
        $this->temporaryDocumentMainPart,
        $matches,
        PREG_SET_ORDER
    );

    //remove ${*} and ${/*}
    foreach ($matches as $match){
        $this->temporaryDocumentMainPart = str_replace(
            $match[0],
            '',
            $this->temporaryDocumentMainPart
        );
    }
}
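
Since every match is replaced by an empty string, the match-then-loop approach can likely be collapsed into a single preg_replace call. A minimal sketch, assuming a hypothetical free function `removeAllMarkers`; inside the class the same call would be applied to `$this->temporaryDocumentMainPart`.

```php
<?php
// Hypothetical helper (not a PHPWord API): deletes every ${...} and ${/...}
// marker in one pass. Falls back to the input if preg_replace reports an error.
function removeAllMarkers(string $xml): string
{
    return preg_replace('/\$\{[^}]*\}/', '', $xml) ?? $xml;
}

// Simplified sample input for illustration.
$xml = '<w:t>${SOMETAG}</w:t><w:t>text block</w:t><w:t>${/SOMETAG}</w:t>';

echo removeAllMarkers($xml), PHP_EOL;
```

This leaves the emptied `<w:t></w:t>` runs in place, just as the method above does.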