Splitting a string into a list of strings using a Java regex

Time: 2018-04-27 08:31:36

Tags: java regex

Let's say I have a string of the form "xyz<ABC><ABC><ABC>pqr". I am using a regex to find the <ABC> substrings. I want to break the string into a List<String> that contains both the <ABC> matches and the text around them (note: the order matters here).

How can I use a regex to break it into ["xyz","<ABC>","<ABC>","<ABC>","pqr"]?

One more thing: is regex matching the best way to do this?

5 answers:

Answer 0 (score: 1)

You can use String.split with regex lookarounds to split at the zero-width positions immediately before or after <ABC>:

String regex = "(?<=<ABC>)|(?=<ABC>)";
String input = "xyz<ABC><ABC><ABC>pqr";
List<String> answer = Arrays.asList(input.split(regex));
System.out.println(answer);

Here is a slightly different solution that uses a regex with a stream of MatchResults (requires Java 9):

Pattern p = Pattern.compile("(?:(?!<ABC>).)++|(?:<ABC>)");
String input = "xyz<ABC><ABC><ABC>pqr";
List<String> answer = p.matcher(input).results()
        .map(MatchResult::group).collect(Collectors.toList());
System.out.println(answer);

Output:

[xyz, <ABC>, <ABC>, <ABC>, pqr]

Answer 1 (score: 0)

Here is something that may help. Basically, you match the whole string and then read the capture groups.

    String str = "xyz<ABC><ABC><ABC>pqr";
    String regexPattern = "(.*)<(.*)><(.*)><(.*)>(.*)";

    Pattern pattern = Pattern.compile(regexPattern);
    Matcher matcher = pattern.matcher(str);
    if (matcher.matches()){
        for (int i=1; i <= matcher.groupCount(); i++){
            System.out.println(matcher.group(i));
        }
    }

This will output:

xyz
ABC
ABC
ABC
pqr

You can then put these values into a List as needed.

Answer 2 (score: 0)

Yes, you can use a regex:

private static List<String> splitString(String input) {
    List<String> result = new ArrayList<>();
    Pattern re = Pattern.compile("<[^>]*>");
    Matcher matcher = re.matcher(input);
    int pos = 0;
    while (matcher.find()) {
        if (matcher.start() > pos) {
            result.add(input.substring(pos, matcher.start()));
        }
        result.add(matcher.group());
        pos = matcher.end();
    }
    if (pos < input.length()) {
        result.add(input.substring(pos));
    }
    return result;
}
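
To see how the find-and-substring loop behaves on other inputs, here is a minimal self-contained sketch. The class name TagSplitter and the extra sample inputs are hypothetical and chosen only for illustration; the method body repeats the loop from this answer so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagSplitter {
    // Matches any "<...>" token, not only the literal <ABC>.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    static List<String> splitString(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TAG.matcher(input);
        int pos = 0; // end of the previous match
        while (matcher.find()) {
            // Keep any plain text between the previous match and this one.
            if (matcher.start() > pos) {
                result.add(input.substring(pos, matcher.start()));
            }
            result.add(matcher.group()); // the tag itself
            pos = matcher.end();
        }
        // Keep any plain text after the last tag.
        if (pos < input.length()) {
            result.add(input.substring(pos));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(splitString("xyz<ABC><ABC><ABC>pqr")); // [xyz, <ABC>, <ABC>, <ABC>, pqr]
        System.out.println(splitString("<A>x<B>"));               // [<A>, x, <B>]
        System.out.println(splitString("no tags"));               // [no tags]
    }
}
```

Because the loop only adds non-empty text segments, a string that starts or ends with a tag does not produce empty entries in the list.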

Answer 3 (score: 0)

        List<String> list = new LinkedList<>();
        String str = "xyz<ABC><ABC><ABC>pqr";
        Pattern pattern = Pattern.compile("^(\\w+)(<\\w+>)(<\\w+>)(<\\w+>)(\\w+)");
        Matcher matcher = pattern.matcher(str);
        if (matcher.matches()){
            for (int i=1; i <= matcher.groupCount(); i++){
                list.add(matcher.group(i));
            }
        }
        System.out.println(list);

This will output:

[xyz, <ABC>, <ABC>, <ABC>, pqr]

Answer 4 (score: -1)

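yourString.split(regex) gives you a String array, not a List of Strings. If you really need a List, you can convert the array with Arrays.asList. However, if your regex matches only the delimiter (the <ABC> string), the result will not include the delimiters.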
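
The point about dropped delimiters can be checked with a small sketch. The class and method names below are hypothetical and used only for illustration. The first method splits on the <ABC> delimiter itself, the second splits at the zero-width positions around it using lookarounds.

```java
import java.util.Arrays;
import java.util.List;

class SplitDelimiterDemo {
    // Splitting on the delimiter consumes it, so <ABC> never appears in the result.
    static List<String> splitOnDelimiter(String input) {
        return Arrays.asList(input.split("<ABC>"));
    }

    // Splitting at zero-width positions before or after <ABC> consumes nothing,
    // so the delimiters survive as separate elements.
    static List<String> splitKeepingDelimiter(String input) {
        return Arrays.asList(input.split("(?<=<ABC>)|(?=<ABC>)"));
    }

    public static void main(String[] args) {
        String input = "xyz<ABC><ABC><ABC>pqr";
        System.out.println(splitOnDelimiter(input));      // [xyz, , , pqr]
        System.out.println(splitKeepingDelimiter(input)); // [xyz, <ABC>, <ABC>, <ABC>, pqr]
    }
}
```

Note that the plain split also yields empty strings between adjacent delimiters, so the delimiters cannot be recovered from that result afterwards.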