PHP - Remove words (http | https | www | .com | .net) from a string when they do not start with specific words

Time: 2015-04-27 01:45:54

Tags: php

I have a string that contains some text and some URLs. My goal is to remove the following from the string:

$removeThis = array('http://', 'https://', 'www.', '.com', '.net');

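But only if the word being stripped does not start with http://good.com, http://www.good.com, https://good.com, or https://www.good.com.

In other words, I want to remove the http | s | www. | .com | .net parts from the string (but only if they do not belong to the good.com domain).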

INPUT:

$string='Hello world, this is spamming: www.spam.com, spam.net, https://spam.com, https://spam.com/tester. And this is not spam so do not touch it: http://www.good.com/okay, http://good.com, and also https://good.com/well';

The result should be:

Hello world, this is spamming: spam, spam, spam, spam/tester. And this is not spam so do not touch it: http://www.good.com/okay, http://good.com, and also https://good.com/well

I think preg_replace is needed here.

3 answers:

Answer 0: (score: 1)

Try the following:

$preg = '/(?:https?:\/\/)?(?:www\.)?\w+\.(?:com|net)/i';

$str = preg_replace_callback($preg, function ($matches) {
    // Leave URLs on the whitelisted "good" domain untouched.
    if (preg_match('/https?:\/\/(?:www\.)?good\.(?:com|net)/i', $matches[0])) {
        return $matches[0];
    }
    // Otherwise strip the scheme, "www." and the TLD.
    return preg_replace('/(https?:\/\/|www\.|\.com|\.net)/i', '', $matches[0]);
}, $string);

Answer 1: (score: 0)

This may help you:

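$url = "www.good.net/tooooo.php";
// Patterns are applied in order: scheme, leading "www.", then the TLD and everything after it.
$regex = array('/(https?:\/\/)/', '/^www\./', '/(\.com.|\.net.|\.co.)+([^\s]+)/');
$url = preg_replace($regex, '', $url);
echo $url; // prints "good"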
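
The snippet above works on a single URL, drops the path as well, and has no exception for good.com. If you want to stay with plain preg_replace on the whole string, one option is a negative lookahead that refuses to match anything on good.com. This is only a minimal sketch: the function name stripSpamUrls() is hypothetical, it assumes PHP 7 or later, and it only handles simple name.com / name.net hosts like those in the question.

```php
<?php
// Hypothetical helper, not taken from any of the answers.
function stripSpamUrls(string $text): string
{
    $pattern = '~(?<![\w.-])'                            // never start in the middle of a word or host
             . '(?!(?:https?://)?(?:www\.)?good\.com\b)' // refuse to match anything that is good.com
             . '(?:https?://)?(?:www\.)?'                // optional scheme and "www." (dropped)
             . '([\w-]+)'                                // the name to keep
             . '\.(?:com|net)\b~i';                      // the TLD (dropped)

    // Keep only the captured name for every non-whitelisted match.
    return preg_replace($pattern, '$1', $text);
}
```

Calling stripSpamUrls($string) on the input from the question should leave the three good.com URLs alone and reduce the others to "spam". The lookbehind matters: without it the engine could retry inside a good.com URL and match a fragment such as "ood.com".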

Answer 2: (score: 0)

You should use a regex, which is very powerful. It is easy to do here with the following steps:

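  1. Use preg_replace_callback
  2. Match all URLs
  3. In the callback function, detect whether the URL belongs to a whitelisted domain (preg_match or strrpos)
  4. Still in the callback function: process the match accordingly and return it
  5. Regex for URLs: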

    #^(https?|ftp):\/\/(-\.)?([^\s\/?\.#]+\.?)+(\/[^\s]*)?$#
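
The steps above could be sketched roughly as follows. Note that the regex listed above is anchored with ^ and $, so it tests whether a whole string is a URL; to find URLs inside a longer text, this sketch swaps in its own simpler, unanchored pattern. The function name stripUrlParts(), the $whitelist parameter, and the use of in_array instead of preg_match/strrpos for the whitelist check are assumptions made for illustration (PHP 7 or later).

```php
<?php
// Hypothetical helper illustrating steps 1-4; names and pattern are assumptions.
function stripUrlParts(string $text, array $whitelist = ['good.com']): string
{
    // Step 2: unanchored pattern for optional scheme, optional "www.", a name, then .com or .net.
    $urlPattern = '~(?:https?://)?(?:www\.)?[\w-]+\.(?:com|net)\b~i';

    // Step 1: let a callback decide what happens to each match.
    return preg_replace_callback($urlPattern, function (array $m) use ($whitelist): string {
        // Step 3: reduce the match to its bare host, e.g. "https://www.good.com" -> "good.com".
        $host = strtolower(preg_replace('~^(?:https?://)?(?:www\.)?~i', '', $m[0]));
        if (in_array($host, $whitelist, true)) {
            return $m[0]; // whitelisted: return the match unchanged
        }
        // Step 4: not whitelisted, so drop the scheme, "www." and the TLD.
        return preg_replace('~^(?:https?://)?(?:www\.)?|\.(?:com|net)$~i', '', $m[0]);
    }, $text);
}
```

With the question's input, stripUrlParts($string) should produce the expected result shown in the question. Unlike matching on a scheme prefix, this version also leaves a bare "good.com" or "www.good.com" untouched, which may or may not be what you want.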