Unformat a formatted string

Asked: 2015-06-02 22:42:03

Tags: java string format printf string-formatting

I have a simple formatted string:

double d = 12.348678;
int i = 9876;
String s = "ABCD";
System.out.printf("%08.2f%5s%09d", d, s, i);

// %08.2f = '12.348678' -> '00012,35'
// %5s = 'ABCD' -> ' ABCD'
// %09d = '9876' -> '000009876'
// %08.2f%5s%09d = '00012,35 ABCD000009876'

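When I know the pattern %08.2f%5s%09d and the string 00012,35 ABCD000009876, can I somehow "unformat" this string?

For example, the expected result would be 3 tokens: '00012,35', ' ABCD', '000009876'

5 Answers:

Answer 0 (score: 1)

This is specific to your pattern. A general parser for format strings (what we are calling unformatting is really parsing) would look very different.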

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Unformat {

    // Returns the field width captured by group 1 of the pattern, or null if there is none.
    public static Integer getWidth(Pattern pattern, String format) {
        Matcher matcher = pattern.matcher(format);
        if (matcher.find()) {
            return Integer.valueOf(matcher.group(1));
        }
        return null;
    }

    public static void main(String[] args) {
        String format = "%08.2f%5s%09d";
        String formatted = "00012.35 ABCD000009876";
        String[] formats = format.split("%");

        // A leading 0 is the zero-padding flag, so it is skipped before the width is captured.
        Pattern floatWidth = Pattern.compile("^0*([0-9]+)\\..*f$");
        Pattern stringWidth = Pattern.compile("^([0-9]+)s$");
        Pattern intWidth = Pattern.compile("^0*([0-9]+)d$");

        List<String> result = new ArrayList<String>();
        // Kept as a local int: Java passes arguments by value, so the offset
        // cannot be advanced from inside a helper method.
        int start = 0;

        // formats[0] is the empty string before the first '%', so start at 1.
        for (int j = 1; j < formats.length; j++) {
            Pattern p = null;
            if (formats[j].endsWith("f")) {
                p = floatWidth;
            } else if (formats[j].endsWith("s")) {
                p = stringWidth;
            } else if (formats[j].endsWith("d")) {
                p = intWidth;
            }
            Integer width = (p == null) ? null : getWidth(p, formats[j]);
            if (width != null) {
                result.add(formatted.substring(start, start + width));
                start += width; // advance past the field just consumed
            }
        }
        System.out.println(result);
    }

}
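
As a rough illustration of what a more general version might look like, here is a minimal sketch that reads every conversion from the format string with one regular expression and slices the input by the declared widths. The class name FixedWidthUnformat and the SPEC pattern are made up for this example. It assumes every conversion carries an explicit width, the format contains no literal text, %% or %n, and no value was wider than its field (a width in printf is only a minimum).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FixedWidthUnformat {

    // One conversion: '%', optional flags, optional width (group 1),
    // optional precision, then the conversion character.
    private static final Pattern SPEC =
            Pattern.compile("%[-#+ 0,(]*([0-9]+)?(?:\\.[0-9]+)?[a-zA-Z]");

    /** Cuts formatted into one token per conversion, using each declared width. */
    public static List<String> unformat(String format, String formatted) {
        List<String> tokens = new ArrayList<>();
        Matcher m = SPEC.matcher(format);
        int start = 0;
        while (m.find()) {
            if (m.group(1) == null) {
                throw new IllegalArgumentException("No width in " + m.group());
            }
            int end = start + Integer.parseInt(m.group(1));
            if (end > formatted.length()) {
                throw new IllegalArgumentException("Input too short for " + m.group());
            }
            tokens.add(formatted.substring(start, end));
            start = end; // the next field begins where this one ended
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [00012,35,  ABCD, 000009876]
        System.out.println(unformat("%08.2f%5s%09d", "00012,35 ABCD000009876"));
    }
}
```

Because the slicing depends only on widths, it works the same whether the decimal separator is '.' or ','.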

Answer 1 (score: 1)

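Judging from the output format of "%08.2f%5s%09d", it appears to be equivalent to this pattern:

"([0-9]{5,}[.,][0-9]{2,})(.{5,})([0-9]{9,})"

Try the following:

// Requires java.util.ArrayList, java.util.List,
// java.util.regex.Matcher and java.util.regex.Pattern.
public static void main(String[] args) {
    String data = "00012,35 ABCD000009876";
    // [.,] accepts either decimal separator; each group corresponds to one format specifier.
    Matcher matcher = Pattern.compile("([0-9]{5,}[.,][0-9]{2,})(.{5,})([0-9]{9,})").matcher(data);

    List<String> matches = new ArrayList<>();
    if (matcher.matches()) {
        for (int i = 1; i <= matcher.groupCount(); i++) {
            matches.add(matcher.group(i));
        }
    }

    System.out.println(matches);
}

Result: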

[00012,35,  ABCD, 000009876]

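Update

After seeing the comments, here is a generic example that does not rely on regular expressions for the parsing, so as not to copy @bpgergo (+1 to you for the regular expression approach). It also adds some logic for the case where the format exceeds the width of the data.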

public static void main(String[] args) {
    String data = "00012,35 ABCD000009876";
    // Format exceeds width of data
    String format = "%08.2f%5s%09d%9s";
    String[] formatPieces = format.replaceFirst("^%", "").split("%");

    List<String> matches = new ArrayList<>();

    int index = 0;
    for (String formatPiece : formatPieces) {
        // Remove any argument index (e.g. "1$") and the flags +, -, ',' and <
        formatPiece = formatPiece.replaceAll("^([0-9]+\\$)|[-+,<]", "");

        int length = 0;
        switch (formatPiece.charAt(formatPiece.length() - 1)) {
            case 'f':
                if (formatPiece.contains(".")) {
                    length = Integer.parseInt(formatPiece.split("\\.")[0]);
                } else {
                    length = Integer.parseInt(formatPiece.substring(0, formatPiece.length() - 1));
                }
                break;
            case 's':
                length = Integer.parseInt(formatPiece.substring(0, formatPiece.length() - 1));
                break;
            case 'd':
                length = Integer.parseInt(formatPiece.substring(0, formatPiece.length() - 1));
                break;
        }

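        // Strictly '<': a field that ends exactly at the end of the data is added by
        // the else branch, which then stops so surplus format pieces add no empty tokens.
        if (index + length < data.length()) {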
            matches.add(data.substring(index, index + length));
        } else {
            // We've reached the end of the data and need to break from the loop
            matches.add(data.substring(index));
            break;
        }
        index += length;
    }
    System.out.println(matches);
}

Result:

[00012,35,  ABCD, 000009876]
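
The tokens are still strings. If the original values are wanted, each token can be parsed back according to its conversion. The following is only a sketch: the class name TokensToValues and the helper parseDecimal are invented for this example, and Locale.GERMANY is an assumption, chosen as one locale that uses ',' as the decimal separator.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class TokensToValues {

    /** Parses a padded decimal token such as "00012,35" using the given locale. */
    public static double parseDecimal(String token, Locale locale) {
        try {
            return NumberFormat.getInstance(locale).parse(token.trim()).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number: " + token, e);
        }
    }

    public static void main(String[] args) {
        // Assumed locale; use whichever locale produced the formatted string.
        double d = parseDecimal("00012,35", Locale.GERMANY);
        // trim() removes the padding, but also any leading spaces the value itself had.
        String s = " ABCD".trim();
        int i = Integer.parseInt("000009876");
        System.out.println(d + " " + s + " " + i); // prints 12.35 ABCD 9876
    }
}
```

Note that the round trip is lossy: %08.2f rounded 12.348678 to two decimals, so only 12.35 can be recovered.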

Answer 2 (score: 0)

You can do something like this:

// Example input; search for "," below instead if the decimal separator is a comma.
String val = "00012.35 ABCD000009876";

// Find the end of the first value;
// this value always has 2 digits after the decimal point.
int index = val.indexOf(".") + 3;
String token1 = val.substring(0, index);

// Remove the first value from the original String.
val = val.substring(index);

// Keep only the digits after the last non-numerical character.
String token3 = val.replaceAll(".+\\D", "");

// Whatever is left in front of token3 is the second value.
String token2 = val.substring(0, val.length() - token3.length());

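This will fail if the String value ends with a digit, and possibly in some other cases.

Answer 3 (score: 0)

Since you know the pattern, you are effectively dealing with a kind of regular expression. Use regular expressions to get what you need.

Java provides a decent regular expression API for tasks like this.

A regular expression can have capturing groups, and each group can hold a single "unformatted" part. It all depends on the regular expression you use or create.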
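
One way to read this suggestion is to build the regular expression from the format string itself, with one capturing group per conversion. The sketch below is only an illustration under several assumptions: the class FormatToRegex and its methods are invented for this example, only %d, %f and %s with an explicit width are handled, each value is assumed to fill its field exactly, and for %f the width must exceed the precision by at least 2.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FormatToRegex {

    // '%', optional flags, width (group 1), optional precision (group 2), conversion (group 3).
    private static final Pattern SPEC =
            Pattern.compile("%[-#+ 0,(]*([0-9]+)(?:\\.([0-9]+))?([dfs])");

    /** Builds a regex with one capturing group per conversion in the format. */
    public static Pattern toPattern(String format) {
        StringBuilder regex = new StringBuilder();
        Matcher m = SPEC.matcher(format);
        int last = 0;
        while (m.find()) {
            // Text between conversions is matched literally.
            regex.append(Pattern.quote(format.substring(last, m.start())));
            int width = Integer.parseInt(m.group(1));
            switch (m.group(3).charAt(0)) {
                case 'd':
                    regex.append("([ +0-9-]{").append(width).append("})");
                    break;
                case 'f':
                    // Java's %f uses a precision of 6 when none is given.
                    int precision = m.group(2) == null ? 6 : Integer.parseInt(m.group(2));
                    int intPart = width - precision - 1; // one char for the separator
                    regex.append("([ +0-9-]{").append(intPart).append("}[.,][0-9]{")
                         .append(precision).append("})");
                    break;
                default: // 's'
                    regex.append("(.{").append(width).append("})");
            }
            last = m.end();
        }
        regex.append(Pattern.quote(format.substring(last)));
        return Pattern.compile(regex.toString());
    }

    /** Returns one token per conversion, or an empty list if the input does not match. */
    public static List<String> unformat(String format, String formatted) {
        List<String> tokens = new ArrayList<>();
        Matcher m = toPattern(format).matcher(formatted);
        if (m.matches()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                tokens.add(m.group(i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [00012,35,  ABCD, 000009876]
        System.out.println(unformat("%08.2f%5s%09d", "00012,35 ABCD000009876"));
    }
}
```

Compared with slicing by width alone, the typed groups also validate the input: a string that does not have digits where the format expects a number simply fails to match.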

Answer 4 (score: -1)

The simplest way is to parse the string using a regular expression with myString.replaceAll(). myString.split(",") may also help to split the string into an array of strings.