I have a simple format string:
double d = 12.348678;
int i = 9876;
String s = "ABCD";
System.out.printf("%08.2f%5s%09d", d, s, i);
// %08.2f = '12.348678' -> '00012,35'
// %5s = 'ABCD' -> ' ABCD'
// %09d = '9876' -> '000009876'
// %08.2f%5s%09d = '00012,35 ABCD000009876'
When I know the pattern: %08.2f%5s%09d
and the string: 00012,35 ABCD000009876
can I "unformat" this string in some way?
For example, the expected result would be 3 tokens: '00012,35', ' ABCD', '000009876'
Answer 0 (score: 1)
This is specific to your pattern. A general parser for format strings (what we are calling "unformatting" is really parsing) would look quite different.
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Unformat {
    // Returns the width captured by group 1 of the pattern, or null if the specifier does not match.
    public static Integer getWidth(Pattern pattern, String format) {
        Matcher matcher = pattern.matcher(format);
        if (matcher.find()) {
            return Integer.valueOf(matcher.group(1));
        }
        return null;
    }

    public static void main(String[] args) {
        String format = "%08.2f%5s%09d";
        String formatted = "00012.35 ABCD000009876";
        String[] formats = format.split("%");
        List<String> result = new ArrayList<String>();
        int start = 0;
        // formats[0] is the empty string before the first '%', so start at index 1.
        for (int j = 1; j < formats.length; j++) {
            Pattern p = null;
            if (formats[j].endsWith("f")) {
                p = Pattern.compile("([0-9]+)\\..*f");
            } else if (formats[j].endsWith("s")) {
                p = Pattern.compile("([0-9]+)s");
            } else if (formats[j].endsWith("d")) {
                p = Pattern.compile("([0-9]+)d");
            }
            Integer width = (p == null) ? null : getWidth(p, formats[j]);
            if (width != null) {
                // Advance the offset here: an Integer reassigned inside a helper method
                // is not visible to the caller, so every token would start at 0.
                result.add(formatted.substring(start, start + width));
                start += width;
            }
        }
        System.out.println(result);
    }
}
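As a rough idea of what a more general parser could look like, here is a minimal sketch that reads the width out of every specifier with one regex and slices the formatted string accordingly. The class name `FixedWidthUnformat`, the method `split` and the simplified specifier grammar are assumptions made for illustration, not part of the answer above. It also assumes every specifier declares a width, that the format holds no literal text between specifiers, and that no value is wider than its declared width (a printf width is only a minimum).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: slices a formatted string using the widths
// declared in a printf-style format string.
class FixedWidthUnformat {

    // Simplified grammar: %[argument_index$][flags][width][.precision]conversion
    // Group 1 is the width.
    private static final Pattern SPEC =
            Pattern.compile("%(?:\\d+\\$)?[-#+ 0,(]*(\\d+)?(?:\\.\\d+)?[a-zA-Z]");

    static List<String> split(String format, String formatted) {
        List<String> tokens = new ArrayList<>();
        Matcher m = SPEC.matcher(format);
        int start = 0;
        while (m.find()) {
            if (m.group(1) == null) {
                throw new IllegalArgumentException("No width in specifier: " + m.group());
            }
            int width = Integer.parseInt(m.group(1));
            // Clamp the end so a format that asks for more than the data holds does not throw.
            int end = Math.min(start + width, formatted.length());
            tokens.add(formatted.substring(start, end));
            start = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("%08.2f%5s%09d", "00012,35 ABCD000009876"));
    }
}
```

Because the slicing is purely positional, it does not care whether the decimal separator is a dot or a comma. A specifier that runs past the end of the data yields an empty token rather than an exception.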
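Answer 1 (score: 1)
Judging by the output format of "%08.2f%5s%09d", it appears to be equivalent to this pattern:
"([0-9]{5,}[.,][0-9]{2,})(.{5,})([0-9]{9,})"
Try the following:
// Requires java.util.ArrayList, java.util.List, java.util.regex.Matcher and java.util.regex.Pattern.
public static void main(String[] args) {
    String data = "00012,35 ABCD000009876";
    // [.,] accepts either decimal separator; inside a character class '|' would be a literal pipe.
    Matcher matcher = Pattern.compile("([0-9]{5,}[.,][0-9]{2,})(.{5,})([0-9]{9,})").matcher(data);
    List<String> matches = new ArrayList<>();
    if (matcher.matches()) {
        // Group 0 is the whole match, so the tokens start at group 1.
        for (int i = 1; i <= matcher.groupCount(); i++) {
            matches.add(matcher.group(i));
        }
    }
    System.out.println(matches);
}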
Result:
[00012,35,  ABCD, 000009876]
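The regex above was written by hand from the format string. As a sketch of how that step might be automated, the following derives an equivalent pattern from the format. `FormatToRegex` and `toRegex` are hypothetical names, and only a small assumed subset is handled: an optional `0` flag, a mandatory width, an optional precision, and the conversions `f`, `s` and `d`. Numbers are assumed to be non-negative and zero-padded.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: derives a matching regex from a printf-style format.
class FormatToRegex {

    // Assumed subset: %[0]width[.precision] followed by f, s or d.
    private static final Pattern SPEC = Pattern.compile("%0?(\\d+)(?:\\.(\\d+))?([fsd])");

    static String toRegex(String format) {
        StringBuilder regex = new StringBuilder();
        Matcher m = SPEC.matcher(format);
        while (m.find()) {
            int width = Integer.parseInt(m.group(1));
            switch (m.group(3)) {
                case "f": {
                    // Java uses a precision of 6 when %f does not specify one.
                    int precision = m.group(2) == null ? 6 : Integer.parseInt(m.group(2));
                    if (precision == 0) {
                        // No separator is printed when the precision is 0.
                        regex.append("([0-9]{").append(width).append(",})");
                    } else {
                        // The width covers integer digits, separator and fraction digits.
                        int intDigits = Math.max(1, width - precision - 1);
                        regex.append("([0-9]{").append(intDigits).append(",}[.,][0-9]{")
                             .append(precision).append("})");
                    }
                    break;
                }
                case "s":
                    regex.append("(.{").append(width).append(",})");
                    break;
                default: // "d"
                    regex.append("([0-9]{").append(width).append(",})");
                    break;
            }
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        String regex = toRegex("%08.2f%5s%09d");
        Matcher m = Pattern.compile(regex).matcher("00012,35 ABCD000009876");
        if (m.matches()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                System.out.println("'" + m.group(i) + "'");
            }
        }
    }
}
```

The open-ended `{n,}` quantifiers mirror the fact that a printf width is a minimum, but they also make the split ambiguous when a string value ends in digits, so treat the result as a best guess rather than a guaranteed inverse.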
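After seeing the comments, here is a general example that does not rely on regular expressions for the parsing itself, so as not to copy @bpgergo (+1 to you for the regular-expression approach). It also adds some logic for the case where the format exceeds the width of the data.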
// Requires java.util.ArrayList and java.util.List.
public static void main(String[] args) {
    String data = "00012,35 ABCD000009876";
    // Format exceeds width of data
    String format = "%08.2f%5s%09d%9s";
    String[] formatPieces = format.replaceFirst("^%", "").split("%");
    List<String> matches = new ArrayList<>();
    int index = 0;
    for (String formatPiece : formatPieces) {
        // Remove any argument indexes or flags ('-' comes first in the class so it is a literal)
        formatPiece = formatPiece.replaceAll("^([0-9]+\\$)|[-+,<]", "");
        int length = 0;
        switch (formatPiece.charAt(formatPiece.length() - 1)) {
            case 'f':
                if (formatPiece.contains(".")) {
                    // The width is everything before the precision
                    length = Integer.parseInt(formatPiece.split("\\.")[0]);
                } else {
                    length = Integer.parseInt(formatPiece.substring(0, formatPiece.length() - 1));
                }
                break;
            case 's':
                length = Integer.parseInt(formatPiece.substring(0, formatPiece.length() - 1));
                break;
            case 'd':
                length = Integer.parseInt(formatPiece.substring(0, formatPiece.length() - 1));
                break;
        }
        if (index + length < data.length()) {
            matches.add(data.substring(index, index + length));
        } else {
            // We've reached the end of the data and need to break from the loop
            matches.add(data.substring(index));
            break;
        }
        index += length;
    }
    System.out.println(matches);
}
Result:
[00012,35,  ABCD, 000009876]
Answer 2 (score: 0)
You could do it like this:
// val holds the formatted string, e.g. "00012.35 ABCD000009876" ('.' as the decimal separator).
// Find the end of the first value;
// this value will always have 2 digits after the decimal point.
int index = val.indexOf(".") + 3;
String token1 = val.substring(0, index);
// Remove the first value from the original String.
val = val.substring(index);
// Get all digits after the last non-numerical character.
String token3 = val.replaceAll(".+\\D", "");
// Remove the previously extracted value from the remainder of the original String.
String token2 = val.replace(token3, "");
This will fail if the string value (the %5s argument) ends with a digit, and possibly in some other cases.
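Answer 3 (score: 0)
Since you know the pattern, you are effectively dealing with some kind of regular expression. Use regular expressions to meet your needs.
Java provides a decent regex API for tasks like this.
A regular expression can have capturing groups, and each group can hold a single "unformatted" part, as you need. It all depends on the regex you use or create.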
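To make the capturing-group idea concrete, here is a minimal sketch that uses one named group per specifier and then converts each token back to a value. The class name `RegexUnformat`, the method `parse` and the group names are made up for this example, and the pattern assumes that each value exactly fills its declared width.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example: one named capturing group per specifier of "%08.2f%5s%09d".
class RegexUnformat {

    // Exact widths are an assumption; a printf width is really only a minimum.
    private static final Pattern LINE = Pattern.compile(
            "(?<number>[0-9]{5}[.,][0-9]{2})(?<text>.{5})(?<count>[0-9]{9})");

    // Returns { Double, String, Integer } recovered from the formatted string.
    static Object[] parse(String formatted) {
        Matcher m = LINE.matcher(formatted);
        if (!m.matches()) {
            throw new IllegalArgumentException("Input does not match the expected layout");
        }
        // Normalise a decimal comma to a dot before parsing; leading zeros are accepted.
        double d = Double.parseDouble(m.group("number").replace(',', '.'));
        // trim() drops the space padding added by %5s (and any spaces in the value's edges).
        String s = m.group("text").trim();
        int i = Integer.parseInt(m.group("count"));
        return new Object[] { d, s, i };
    }

    public static void main(String[] args) {
        Object[] values = parse("00012,35 ABCD000009876");
        System.out.println(values[0] + " | " + values[1] + " | " + values[2]);
    }
}
```

Note that the round trip is lossy: the double comes back as 12.35, not the original 12.348678, because the digits dropped by %.2f cannot be recovered.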
Answer 4 (score: -1)
The simplest way is to parse the string using a regular expression with myString.replaceAll(). myString.split(",") may also help to split the string into an array of strings.