Find and replace URLs in Java using regex

Date: 2015-05-09 17:31:46

Tags: java regex url replace

I am trying to replace a URL matched by a regular expression using String.replace. The code is below:
public class Test {
    public static void main(String[] args) {
        String test = "https://google.com";
        //String regex = "\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";
        String regex = "(http?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]"; // does not match <http://google.com>

        String newText = test.replace(regex, "");
        System.out.println(newText);
    }
}

I have looked through several questions on SO, but the pattern still is not replaced. Can someone tell me how to achieve this?

2 answers:

Answer 0: (score: 2)

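String.replace() does not accept a regular expression; it treats its first argument as literal text. Use String.replaceAll instead:

    String newText = test.replaceAll(regex, "");

As for the regular expression itself, you should match https? rather than http?, so that the s is optional and both http and https URLs are matched.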
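To make the difference concrete, here is a minimal sketch that runs the question's input through both methods. The class name, constant names, and helper methods are made up for illustration; only the two patterns come from the thread.

```java
public class ReplaceVsReplaceAll {
    // Pattern as written in the question: "http?" makes the final "p" optional, not the "s"
    static final String ORIGINAL_REGEX =
        "(http?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";

    // Same pattern with the scheme corrected to "https?"
    static final String FIXED_REGEX =
        "(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";

    // String.replace looks for the pattern text itself, character for character
    static String literalReplace(String input, String pattern) {
        return input.replace(pattern, "");
    }

    // String.replaceAll compiles the first argument as a regular expression
    static String regexReplace(String input, String pattern) {
        return input.replaceAll(pattern, "");
    }

    public static void main(String[] args) {
        String test = "https://google.com";
        System.out.println("[" + literalReplace(test, FIXED_REGEX) + "]");  // input unchanged
        System.out.println("[" + regexReplace(test, ORIGINAL_REGEX) + "]"); // input unchanged
        System.out.println("[" + regexReplace(test, FIXED_REGEX) + "]");    // empty string
    }
}
```

Only the last call removes the URL: the first never interprets the pattern as a regex, and the second cannot get past the "s" in "https".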

Answer 1: (score: 2)

You cannot use a regular expression with replace; use replaceAll instead, i.e.:

    // Requires: import java.util.regex.PatternSyntaxException;
    String test = "something https://google.com something";
    try {
        String newText = test.replaceAll("(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]", "");
        System.out.println(newText);
    } catch (PatternSyntaxException ex) {
        // Syntax error in the regular expression
    } catch (IllegalArgumentException ex) {
        // Syntax error in the replacement text (unescaped $ signs?)
    } catch (IndexOutOfBoundsException ex) {
        // Non-existent backreference used in the replacement text
    }

Output:

something  something

Live demo:

http://ideone.com/Yi2hrb
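If the same replacement runs many times, or the replacement text may contain "$" or "\", one possible variation is to compile the pattern once and quote the replacement. The sketch below assumes a hypothetical helper named maskUrls in a made-up class UrlMasker; Pattern.compile, Matcher.replaceAll and Matcher.quoteReplacement are standard java.util.regex calls.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlMasker {
    // Compiled once and reused; String.replaceAll recompiles the pattern on every call
    private static final Pattern URL = Pattern.compile(
        "(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]");

    // Replaces every URL with the given text, taken literally.
    // quoteReplacement escapes "$" and "\" so they are not read as group references.
    static String maskUrls(String input, String replacement) {
        return URL.matcher(input).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(maskUrls("something https://google.com something", "<$url>"));
    }
}
```

Without the quoting step, a replacement such as "<$url>" would be parsed as a group reference and rejected, which is the IllegalArgumentException case mentioned in the catch block.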

Regex explanation:

(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]

Options: Case sensitive; Exact spacing; Dot doesn’t match line breaks; ^$ don’t match at line breaks; Default line breaks; Regex syntax only

Match the regex below and capture its match into backreference number 1 «(https?|ftp|file)»
   Match this alternative «https?»
      Match the character string “http” literally «http»
      Match the character “s” literally «s?»
         Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
   Or match this alternative «ftp»
      Match the character string “ftp” literally «ftp»
   Or match this alternative «file»
      Match the character string “file” literally «file»
Match the character string “://” literally «://»
Match a single character present in the list below «[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
   The literal character “-” «-»
   A character in the range between “a” and “z” «a-z»
   A character in the range between “A” and “Z” «A-Z»
   A character in the range between “0” and “9” «0-9»
   A single character from the list “+&@#/%?=~_|!:,.;” «+&@#/%?=~_|!:,.;»
Match a single character present in the list below «[-a-zA-Z0-9+&@#/%=~_|]»
   The literal character “-” «-»
   A character in the range between “a” and “z” «a-z»
   A character in the range between “A” and “Z” «A-Z»
   A character in the range between “0” and “9” «0-9»
   A single character from the list “+&@#/%=~_|” «+&@#/%=~_|»
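The breakdown above can be checked with a short sketch that prints backreference 1 (the scheme) next to the full match. It also shows the effect of the final character class, which lacks ",", "." and similar punctuation, so a URL at the end of a clause does not swallow the punctuation that follows it. The class name, the describe method, and the example.org address are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlGroups {
    static final Pattern URL = Pattern.compile(
        "(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]");

    // Returns one "scheme -> full match" entry for every URL found in the text
    static List<String> describe(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = URL.matcher(text);
        while (m.find()) {
            // group(1) is the captured scheme, group() is the whole match
            found.add(m.group(1) + " -> " + m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "See https://google.com/search?q=java, or ftp://example.org/file.txt.";
        for (String line : describe(text)) {
            System.out.println(line);
        }
    }
}
```

For this input the matches stop at "java" and "txt": the greedy first class consumes the trailing comma and period, then backtracks because the last character must come from the second, narrower class.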