Question

I tried to find the answer on Stack Overflow, but I'm a complete RegEx noob!

All I need (if possible) is to match PDF URLs in some HTML: if a URL does not start with http://, add /content/ to the beginning; if it does start with http://, do nothing.

Answer 1

The regex you want is probably

http://add/content/\S+?\.pdf

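This says the match must start with "http://add/content/", may then contain any run of non-whitespace characters, and ends at the first ".pdf" it reaches. How you apply it depends on the language you are using. In PHP, for example, it would be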

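// preg_match_all returns the number of full matches (0 if none, false on error).
// Note that count($matches) is never 0 here, because $matches[0] always exists.
$count = preg_match_all('|http://add/content/\S+?\.pdf|', $html, $matches);
if ($count) {
     // do stuff with the matched URLs in $matches[0]
} else {
     // there were no matches of that form
}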
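For comparison, the same pattern can be tried in JavaScript. This is only a minimal sketch: `sampleHtml` is made-up input for illustration, and the pattern is the one from this answer, so it finds only URLs that already start with http://add/content/.

```javascript
// Pattern from the answer above: literal prefix, lazy run of non-whitespace, then ".pdf".
const pdfPattern = /http:\/\/add\/content\/\S+?\.pdf/g;

// Hypothetical sample HTML, for illustration only.
const sampleHtml =
  '<a href="http://add/content/docs/a.pdf">A</a> <a href="files/b.pdf">B</a>';

// With the g flag, String.prototype.match returns every full match, or null if there are none.
const found = sampleHtml.match(pdfPattern) || [];

console.log(found); // [ 'http://add/content/docs/a.pdf' ]
```

The second link is not reported, since it does not start with the literal prefix.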

Answer 2

Assuming you want to do this in JavaScript:

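var links = document.getElementsByTagName("a");
for (var i = 0; i < links.length; i++) {
    var link = links[i];
    var href = link.getAttribute("href");
    // Skip anchors without an href, and anything that is not a PDF link.
    if (!href || !/\.pdf$/i.test(href)) {
        continue;
    }
    // Only prefix URLs that do not already start with http://.
    if (!/^http:\/\//.test(href)) {
        link.setAttribute("href", "/content/" + href);
    }
}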
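The loop above needs a DOM. If the HTML is only available as a string, a regex replacement along the following lines may be enough. This is a sketch under stated assumptions: `prefixPdfLinks` is a hypothetical helper name, href values are assumed to be double-quoted, and https:// is treated as absolute as well, which goes slightly beyond what the question asked for.

```javascript
// Hypothetical helper: prefixes relative PDF hrefs in an HTML string with "/content/".
// Assumes double-quoted href attributes; a real HTML parser would be more robust.
function prefixPdfLinks(html) {
  // (?!https?:\/\/) rejects values that already start with http:// or https://.
  // ([^"]+?\.pdf) captures the attribute value, which must end in ".pdf" right before the closing quote.
  return html.replace(
    /href="(?!https?:\/\/)([^"]+?\.pdf)"/gi,
    'href="/content/$1"'
  );
}

// Example input, for illustration only.
const input =
  '<a href="files/b.pdf">B</a> <a href="http://example.com/a.pdf">A</a>';
console.log(prefixPdfLinks(input));
```

Only the first link is rewritten, to /content/files/b.pdf. A value that already begins with a slash would end up with a double slash after the prefix, just as in the DOM version.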

Regex to replace part of a URL with a string only if the URL does not start with http://
