Question

I have a full link like this:

http://localhost:8080/suffix/rest/of/link

How can I write a regex in Java that returns only the main part of the URL with the suffix: http://localhost/suffix and without: /rest/of/link?

Possible protocols: http, https
Possible ports: many possibilities

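I assume I need to remove all the text after the 3rd occurrence of the '/' character (inclusive). I would like to do it this way, but I don't know regex well. Could you help me write the regex correctly?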

String appUrl = fullRequestUrl.replaceAll("(.*\\/{2})", ""); //this removes 'http://' but this is not my case

Answer 1

I'm not sure why you want to use a regex for this. Java provides URL objects that you can query for the individual parts of a URL.

The following example is taken from the same site to show how it works:

import java.net.*;
import java.io.*;

public class ParseURL {
    public static void main(String[] args) throws Exception {

        URL aURL = new URL("http://example.com:80/docs/books/tutorial"
                           + "/index.html?name=networking#DOWNLOADING");

        System.out.println("protocol = " + aURL.getProtocol());
        System.out.println("authority = " + aURL.getAuthority());
        System.out.println("host = " + aURL.getHost());
        System.out.println("port = " + aURL.getPort());
        System.out.println("path = " + aURL.getPath());
        System.out.println("query = " + aURL.getQuery());
        System.out.println("filename = " + aURL.getFile());
        System.out.println("ref = " + aURL.getRef());
    }
}

Here is the output displayed by the program:

protocol = http
authority = example.com:80
host = example.com
port = 80
path = /docs/books/tutorial/index.html
query = name=networking
filename = /docs/books/tutorial/index.html?name=networking
ref = DOWNLOADING
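
Building on those accessors, here is a minimal sketch of how the pieces could be reassembled into the "main part with suffix" the question asks for. It uses `java.net.URI` rather than `URL`; the class name `UrlPrefix` and the helper `mainPart` are hypothetical names chosen for illustration, and the sketch assumes a hierarchical http or https URL.

```java
import java.net.URI;

public class UrlPrefix {

    // Hypothetical helper: returns scheme://authority/firstPathSegment.
    // Assumes fullUrl is a hierarchical URL such as http://host:port/a/b/c.
    static String mainPart(String fullUrl) {
        URI uri = URI.create(fullUrl);
        String path = uri.getPath(); // e.g. "/suffix/rest/of/link"

        // Splitting "/suffix/rest/of/link" on "/" gives ["", "suffix", "rest", ...],
        // so the first real segment is at index 1 (if it exists).
        String[] segments = path.split("/");
        String first = segments.length > 1 ? segments[1] : "";

        String base = uri.getScheme() + "://" + uri.getAuthority();
        return first.isEmpty() ? base : base + "/" + first;
    }

    public static void main(String[] args) {
        System.out.println(mainPart("http://localhost:8080/suffix/rest/of/link"));
    }
}
```

Because the authority includes the port, this keeps `:8080` in the result; if the port should be dropped, `uri.getHost()` could be used in place of `uri.getAuthority()`.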

Answer 2

Code that gets the main part of the URL:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexpExample {
    public static void main(String[] args) {
        String urlStr  = "http://localhost:8080/suffix/rest/of/link";
        Pattern pattern = Pattern.compile("^((.*:)//([a-z0-9\\-.]+)(|:[0-9]+)/([a-z]+))/(.*)$");

        Matcher matcher = pattern.matcher(urlStr);
        if(matcher.find())
        {
            //there is a main part of url with suffix:
            String mainPartOfUrlWithSuffix = matcher.group(1);
            System.out.println(mainPartOfUrlWithSuffix);
        }
    }
}
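
If a one-line replacement in the style of the question's `replaceAll` attempt is preferred, a shorter variant is possible. The sketch below is an assumption-laden alternative, not the code from this answer: the class name `StripAfterSuffix` and the helper `mainPart` are hypothetical, and the pattern assumes an http or https URL. Note that in `http://localhost:8080/suffix/rest/of/link` the text to drop starts at the fourth '/', because the first two slashes belong to `//`.

```java
public class StripAfterSuffix {

    // Hypothetical helper: keeps scheme, host[:port] and the first path segment,
    // and drops everything after it. Input that does not match is returned unchanged.
    static String mainPart(String fullUrl) {
        // Group 1: "http://" or "https://", then the authority (no '/', '?' or '#'),
        // then "/" and the first path segment. The trailing ".*" consumes the rest.
        return fullUrl.replaceFirst("^(https?://[^/?#]+/[^/?#]+).*$", "$1");
    }

    public static void main(String[] args) {
        System.out.println(mainPart("http://localhost:8080/suffix/rest/of/link"));
    }
}
```

Excluding `?` and `#` from the character classes means a query string or fragment directly after the suffix is also removed.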

How to remove some part of a URL via regex?
