Question

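I'm looking for an elegant regular expression that cleans up brackets and removes any bracketed content that looks like a file name.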

[Nibh justo] elit Nulla [link.pdf]  auctor ipsum molestie (link.pdf) 
Condimentum euismod non [link.xls](link.xls) [link.doc](link.doc) tempus 
In [Curabitur] et

The result should be:

Nibh justo elit Nulla auctor ipsum molestie Condimentum euismod 
non tempus In Curabitur et

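I'm sure there must be a short way to do this. (Here "file" simply means the content includes a dot. There is no need to check for sentences.)

Thanks for your help.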

Answer 1

Something like this?

$str = '[Nibh justo] elit Nulla [link.pdf]  auctor ipsum molestie (link.pdf) 
Condimentum euismod non [link.xls](link.xls) [link.doc](link.doc) tempus In 
[Curabitur] and other [./beta/link.pfd]';

$str = preg_replace('`(\(|\[)[\w/\.-]+\.[a-z]+(\)|\])`i', '', $str);
$str = str_replace(array('[', ']'), '', $str);

echo $str;

The result is:

Nibh justo elit Nulla auctor ipsum molestie Condimentum euismod 
non tempus In Curabitur and other
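
Note that the two replacements above leave the original spacing and line breaks in place. As a minimal sketch, assuming that runs of whitespace should be collapsed into single spaces, the steps can be wrapped in a helper. The name `cleanBrackets` and the whitespace pass are illustrative additions, not part of the original answer:

```php
<?php
// Hypothetical helper: wraps the two steps from the answer and adds
// a whitespace clean-up pass (an assumption, not in the original).
function cleanBrackets(string $str): string
{
    // 1. Drop [file.ext] and (file.ext) groups, including simple paths.
    $str = preg_replace('`(\(|\[)[\w/\.-]+\.[a-z]+(\)|\])`i', '', $str);
    // 2. Strip the remaining square brackets but keep their content.
    $str = str_replace(array('[', ']'), '', $str);
    // 3. Collapse runs of whitespace (including newlines) and trim the ends.
    return trim(preg_replace('/\s+/', ' ', $str));
}

$str = "[Nibh justo] elit Nulla [link.pdf]  auctor ipsum molestie (link.pdf) \n"
     . "Condimentum euismod non [link.xls](link.xls) [link.doc](link.doc) tempus In \n"
     . "[Curabitur] and other [./beta/link.pfd]";

echo cleanBrackets($str), "\n";
```

With the extra pass the output is a single line with single spaces, which is the form shown in the result above.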

Answer 2

Try this:

(?:[\[\(]\w+\.\w+[\]\)])|(?:[\[\(](?=[0-9A-Za-z]))|(?:(?<=[0-9A-Za-z])[\]\)])

$result = preg_replace('/(?:[[(]\w+\.\w+[\])])|(?:[[(](?=[0-9A-Za-z]))|(?:(?<=[0-9A-Za-z])[\])])/m', '', $subject);

Explanation

    <!--
(?:[\[\(]\w+\.\w+[\]\)])|(?:[\[\(](?=[0-9A-Za-z]))|(?:(?<=[0-9A-Za-z])[\]\)])

Options: ^ and $ match at line breaks

Match either the regular expression below (attempting the next alternative only if this one fails) «(?:[\[\(]\w+\.\w+[\]\)])»
   Match the regular expression below «(?:[\[\(]\w+\.\w+[\]\)])»
      Match a single character present in the list below «[\[\(]»
         A [ character «\[»
         A ( character «\(»
      Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
         Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      Match the character “.” literally «\.»
      Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
         Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      Match a single character present in the list below «[\]\)]»
         A ] character «\]»
         A ) character «\)»
Or match regular expression number 2 below (attempting the next alternative only if this one fails) «(?:[\[\(](?=[0-9A-Za-z]))»
   Match the regular expression below «(?:[\[\(](?=[0-9A-Za-z]))»
      Match a single character present in the list below «[\[\(]»
         A [ character «\[»
         A ( character «\(»
      Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=[0-9A-Za-z])»
         Match a single character present in the list below «[0-9A-Za-z]»
            A character in the range between “0” and “9” «0-9»
            A character in the range between “A” and “Z” «A-Z»
            A character in the range between “a” and “z” «a-z»
Or match regular expression number 3 below (the entire match attempt fails if this one fails to match) «(?:(?<=[0-9A-Za-z])[\]\)])»
   Match the regular expression below «(?:(?<=[0-9A-Za-z])[\]\)])»
      Assert that the regex below can be matched, with the match ending at this position (positive lookbehind) «(?<=[0-9A-Za-z])»
         Match a single character present in the list below «[0-9A-Za-z]»
            A character in the range between “0” and “9” «0-9»
            A character in the range between “A” and “Z” «A-Z»
            A character in the range between “a” and “z” «a-z»
      Match a single character present in the list below «[\]\)]»
         A ] character «\]»
         A ) character «\)»
-->

When the regex above is applied to:

[Nibh justo] elit Nulla [link.pdf]  auctor ipsum molestie (link.pdf) 
Condimentum euismod non [link.xls](link.xls) [link.doc](link.doc) tempus 
In [Curabitur] et

it produces the desired result:

Nibh justo elit Nulla auctor ipsum molestie Condimentum euismod non tempus In Curabitur et
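
To see how the three alternatives behave on individual inputs, here is a small sketch. The pattern is copied from the answer; the variable names and sample strings are illustrative assumptions:

```php
<?php
// Pattern copied from the answer; $pattern and $samples are illustrative names.
$pattern = '/(?:[\[\(]\w+\.\w+[\]\)])|(?:[\[\(](?=[0-9A-Za-z]))|(?:(?<=[0-9A-Za-z])[\]\)])/m';

$samples = array(
    '[link.doc](link.doc) tempus',  // bracketed file names are removed entirely
    'In [Curabitur] et',            // brackets around plain words are stripped
    'a (word) here',                // round brackets are stripped as well
    'other [./beta/link.pfd]',      // a path does not fit \w+\.\w+
);

foreach ($samples as $sample) {
    // var_export quotes the string, so leftover whitespace stays visible.
    echo var_export(preg_replace($pattern, '', $sample), true), "\n";
}
```

Two things are worth checking against your own data. The pattern removes brackets but not the surrounding whitespace, so spacing and line breaks from the input remain. And because the file-name part is `\w+\.\w+`, a bracketed path such as `[./beta/link.pfd]` is not removed as a whole; only its closing bracket matches, via the third alternative.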

php regex - clean up file names
