Text not split correctly

Time: 2019-08-18 15:46:32

Tags: java regex split

I am trying to extract text and hex colors from a string.

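It currently has a small problem with the ">" symbol: when a name itself contains ">", everything after that character is lost.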

This is the code I have so far:

package main.cache;

import java.util.Arrays;
import java.util.regex.Pattern;

public class Main {

    public static void extract(String string) { 
        final String STARTS_WITH_COLOR_LITERAL = "^[A-Fa-f0-9]{6}|[A-Fa-f0-9]{3}";
        final Pattern pattern = Pattern.compile(STARTS_WITH_COLOR_LITERAL);
        Object[] objects = Arrays.stream(string.split("<col=")).filter(part -> pattern.matcher(part).find()).toArray();
        String name;
        String color = null;
        for (int i = 0; i < objects.length; i++) {
            String[] line = objects[i].toString().split(">");
            if (line.length == 1) {
                name = line[0];
            } else {
                color = line[0];
                name = line[1];
            }
            System.out.println("Color " + color + ", name " + name);
        }
    }

    public static void main(String[] args) {
        extract("something before<col=ff00ff>mercides> car<col=ffff00>plates");
    }
}

For example, when passing this argument, the expected output is

Color null, name something before
Color ff00ff, name mercides> car
Color ffff00, name plates

The output I currently get is

Color null, name something before
Color ff00ff, name mercides
Color ffff00, name plates
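
The missing " car" comes from split(">"), which splits on every ">" and leaves it in a third array element that the loop never reads. One possible minimal change, sketched below, is to pass a limit of 2 to String.split so that only the first ">" separates the color from the name. The class name SplitLimitSketch and the list-returning extract helper are hypothetical. The sketch assumes that the text before the first "<col=" has no color and that every later part begins with a color, and it does not validate the color as hex.

```java
import java.util.ArrayList;
import java.util.List;

class SplitLimitSketch {

    // Hypothetical helper: returns the lines instead of printing them.
    static List<String> extract(String input) {
        List<String> lines = new ArrayList<>();
        String[] parts = input.split("<col=");
        for (int i = 0; i < parts.length; i++) {
            // Skip the empty leading part produced when the input starts with "<col=".
            if (i == 0 && parts[i].isEmpty()) {
                continue;
            }
            String color = null;
            String name = parts[i];
            // Only parts after the first one were preceded by "<col=".
            if (i > 0) {
                // Limit 2: split at the first ">" only, keep later ">" in the name.
                String[] pair = parts[i].split(">", 2);
                if (pair.length == 2) {
                    color = pair[0];
                    name = pair[1];
                }
            }
            lines.add("Color " + color + ", name " + name);
        }
        return lines;
    }

    public static void main(String[] args) {
        extract("something before<col=ff00ff>mercides> car<col=ffff00>plates")
                .forEach(System.out::println);
    }
}
```

With the sample input from the question, this prints the three expected lines, including "Color ff00ff, name mercides> car".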

1 answer:

Answer 0 (score: 0)

If you are looking for color/name pairs (in that order), you can use: (?><col=(?<color>[A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})>)?(?<name>(?><.*?>)?[^<]+)

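import java.util.regex.Matcher;
import java.util.regex.Pattern;

static void extract(String string) {
    Pattern pattern =
            Pattern.compile("(?><col=(?<color>[A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})>)?(?<name>(?><.*?>)?[^<]+)");
    Matcher m = pattern.matcher(string);
    // Each match is one optional color tag followed by the name after it
    while (m.find()) {
        String color = m.group("color"); // null when the name has no color tag in front of it
        String name = m.group("name");
        System.out.printf("Color %s, name %s%n", color, name);
    }
}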
/*
Color null, name something before
Color ff00ff, name mercides> car
Color ffff00, name plates
*/

Regex details (see Regex101 for a full breakdown; the explanation is in the right-hand panel):

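  1. (?><col=(?<color>[A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})>)? is for the color: it starts with <col= and ends with >, with six or three hex characters in between. It is optional, hence the ? at the end.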

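  2. (?<name>(?><.*?>)?[^<]+) is for the name: the group may start with another tag, and is then followed by one or more characters, each of which can be anything except <.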
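
To see how the two groups behave on other inputs, here is a small sketch that collects the pairs into a list instead of printing them. The class name ColorNameSketch, the pairs helper, and the sample input are assumptions made for illustration; only the pattern itself comes from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ColorNameSketch {

    // Same pattern as in the answer: optional color tag, then the name.
    private static final Pattern PAIR = Pattern.compile(
            "(?><col=(?<color>[A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})>)?(?<name>(?><.*?>)?[^<]+)");

    // Hypothetical helper: each element is {color, name}; color may be null.
    static List<String[]> pairs(String input) {
        List<String[]> result = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            result.add(new String[] { m.group("color"), m.group("name") });
        }
        return result;
    }

    public static void main(String[] args) {
        // "fff" is a three-digit color; "zzz" is not hex, so that tag is not a color.
        for (String[] p : pairs("plain<col=fff>short<col=zzz>odd")) {
            System.out.println("Color " + p[0] + ", name " + p[1]);
        }
    }
}
```

For this input the sketch prints "Color null, name plain", then "Color fff, name short", then "Color null, name <col=zzz>odd". The last line shows the second group at work: because zzz is not a valid hex color, the color group does not match, and the whole tag is absorbed into the name by the optional (?><.*?>) part.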