I want to parse the following data to obtain the output shown below.
Input:
RTRV-ALM-EQPT::ALL:RA01; SIMULATOR 09-11-20 13:52:15 M RA01 COMPLD "SLOT-1-1-1,CMP:MN,T-FANCURRENT-1-HIGH,NSA,01-10-09,00-00-00,,:\"Fan-T\"," "SLOT-1-1-1,CMP:MJ,T-BATTERYPWR-2-LOW,NSA,01-10-09,00-00-00,,:\"Battery-T\"," "SLOT-1-1-2,CMP:CR,PROC_FAIL,SA,09-11-20,13-51-55,,:\"Processor Failure\"," "SLOT-1-1-3,OLC:MN,T-LASERCURR-1-HIGH,SA, 01-10-07,13-21-03,,:\"Laser-T\"," "SLOT-1-1-3,OLC:MJ,T-LASERCURR-2-LOW,NSA, 01-10-02,21-32-11,,:\" Laser-T\"," "SLOT-1-1-4,OLC:MN,T-LASERCURR-1-HIGH,SA,01-10-05,02-14-03,,:\"Laser-T\"," "SLOT-1-1-4,OLC:MJ,T-LASERCURR-2-LOW,NSA,01-10-04,01-03-02,,:\"Laser-T\"," ;
Output:
1) RTRV-ALM-EQPT::ALL:RA01;
2) SIMULATOR
3) 09-11-20
4) 13:52:15
5) M
6) RA01
7) COMPLD
8) "SLOT-1-1-1,CMP:MN,T-FANCURRENT-1-HIGH,NSA,01-10-09,00-00-00,,:\"Fan-T\","
9) "SLOT-1-1-1,CMP:MJ,T-BATTERYPWR-2-LOW,NSA,01-10-09,00-00-00,,:\"Battery-T\","
10) "SLOT-1-1-2,CMP:CR,PROC_FAIL,SA,09-11-20,13-51-55,,:\"Processor Failure\","
11) "SLOT-1-1-3,OLC:MN,T-LASERCURR-1-HIGH,SA, 01-10-07,13-21-03,,:\"Laser-T\","
12) "SLOT-1-1-3,OLC:MJ,T-LASERCURR-2-LOW,NSA, 01-10-02,21-32-11,,:\" Laser-T\","
13) "SLOT-1-1-4,OLC:MN,T-LASERCURR-1-HIGH,SA,01-10-05,02-14-03,,:\"Laser-T\","
14) "SLOT-1-1-4,OLC:MJ,T-LASERCURR-2-LOW,NSA,01-10-04,01-03-02,,:\"Laser-T\","
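Answer 0 (score: 1)
The best approach is probably not to think of this as converting the first text into the second.
Instead, first think about parsing the first text into a set of Java objects that represent what the data actually is. For example, the second/third line of the input might be represented by a Test class with "area", "day" and "time" properties. (Only you can come up with a sensible model, based on your knowledge of what everything means.)
Then, once you have a good in-memory representation of the information in the file, you can think about printing it as text, as in the second case. It should now be easy to print the various fields and properties from the Java objects, rather than trying to transform the input text on the fly.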
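As a rough illustration of this "model first, print later" idea, here is a minimal sketch. The class names (ReportModelSketch, Header, Report) and the method toNumberedText are hypothetical, and the meaning assigned to each field ("area", "day", "time") is only an assumption based on the example above; the real model depends on what the data means.

```java
import java.util.ArrayList;
import java.util.List;

class ReportModelSketch {
    /** Hypothetical model of a line such as "SIMULATOR 09-11-20 13:52:15". */
    static final class Header {
        final String area;
        final String day;
        final String time;

        Header(String area, String day, String time) {
            this.area = area;
            this.day = day;
            this.time = time;
        }

        /** Splits the line on whitespace and expects exactly three fields. */
        static Header parse(String line) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected 3 fields: " + line);
            }
            return new Header(parts[0], parts[1], parts[2]);
        }
    }

    /** Hypothetical in-memory report: one header plus the raw alarm records. */
    static final class Report {
        final Header header;
        final List<String> alarms = new ArrayList<>();

        Report(Header header) {
            this.header = header;
        }

        /** Printing is a separate step that only reads the model. */
        String toNumberedText() {
            StringBuilder sb = new StringBuilder();
            int n = 0;
            sb.append(++n).append(") ").append(header.area).append('\n');
            sb.append(++n).append(") ").append(header.day).append('\n');
            sb.append(++n).append(") ").append(header.time).append('\n');
            for (String alarm : alarms) {
                sb.append(++n).append(") ").append(alarm).append('\n');
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        Report report = new Report(Header.parse("   SIMULATOR 09-11-20 13:52:15"));
        report.alarms.add("\"SLOT-1-1-2,CMP:CR,PROC_FAIL,SA,09-11-20,13-51-55,,:\\\"Processor Failure\\\",\"");
        System.out.print(report.toNumberedText());
    }
}
```

Keeping parsing (Header.parse) apart from printing (toNumberedText) means a different output format later only requires a new print method, not a new parser.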
Answer 1 (score: 1)
Assuming the file is relatively small, so that it can be read into memory, try something like this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String text = "RTRV-ALM-EQPT::ALL:RA01;\n"+
                "\n"+
                " SIMULATOR 09-11-20 13:52:15\n"+
                "M RA01 COMPLD\n"+
                " \"SLOT-1-1-1,CMP:MN,T-FANCURRENT-1-HIGH,NSA,01-10-09,00-00-00,,:\\\"Fan-T\\\",\"\n"+
                " \"SLOT-1-1-1,CMP:MJ,T-BATTERYPWR-2-LOW,NSA,01-10-09,00-00-00,,:\\\"Battery-T\\\",\"\n"+
                " \"SLOT-1-1-2,CMP:CR,PROC_FAIL,SA,09-11-20,13-51-55,,:\\\"Processor Failure\\\",\"\n"+
                " \"SLOT-1-1-3,OLC:MN,T-LASERCURR-1-HIGH,SA, 01-10-07,13-21-03,,:\\\"Laser-T\\\",\"\n"+
                " \"SLOT-1-1-3,OLC:MJ,T-LASERCURR-2-LOW,NSA, 01-10-02,21-32-11,,:\\\" Laser-T\\\",\"\n"+
                " \"SLOT-1-1-4,OLC:MN,T-LASERCURR-1-HIGH,SA,01-10-05,02-14-03,,:\\\"Laser-T\\\",\"\n"+
                " \"SLOT-1-1-4,OLC:MJ,T-LASERCURR-2-LOW,NSA,01-10-04,01-03-02,,:\\\"Laser-T\\\",\"\n"+
                ";";
        // Match either a quoted string (which may contain backslash escapes)
        // or a run of non-whitespace characters. The character class excludes
        // both the backslash and the double quote.
        Matcher m = Pattern.compile("\"(?:\\\\.|[^\\\\\"])*\"|\\S+").matcher(text);
        int n = 0;
        while (m.find()) {
            // Print each match with a running number.
            System.out.println((++n) + ") " + m.group());
        }
    }
}
Output:
1) RTRV-ALM-EQPT::ALL:RA01;
2) SIMULATOR
3) 09-11-20
4) 13:52:15
5) M
6) RA01
7) COMPLD
8) "SLOT-1-1-1,CMP:MN,T-FANCURRENT-1-HIGH,NSA,01-10-09,00-00-00,,:\"Fan-T\","
9) "SLOT-1-1-1,CMP:MJ,T-BATTERYPWR-2-LOW,NSA,01-10-09,00-00-00,,:\"Battery-T\","
10) "SLOT-1-1-2,CMP:CR,PROC_FAIL,SA,09-11-20,13-51-55,,:\"Processor Failure\","
11) "SLOT-1-1-3,OLC:MN,T-LASERCURR-1-HIGH,SA, 01-10-07,13-21-03,,:\"Laser-T\","
12) "SLOT-1-1-3,OLC:MJ,T-LASERCURR-2-LOW,NSA, 01-10-02,21-32-11,,:\" Laser-T\","
13) "SLOT-1-1-4,OLC:MN,T-LASERCURR-1-HIGH,SA,01-10-05,02-14-03,,:\"Laser-T\","
14) "SLOT-1-1-4,OLC:MJ,T-LASERCURR-2-LOW,NSA,01-10-04,01-03-02,,:\"Laser-T\","
15) ;
The only difference is match 15, the trailing ";", which I believe you forgot.
The raw regex (without all the escaping) looks like this:
"(?:\\.|[^\\"])*"|\S+
And it matches:
"            # match a double quote
(?:          # open a non-capturing group
  \\.        # match a backslash followed by any char (except line breaks)
  |          # OR
  [^\\"]     # match any char except a backslash and a double quote
)*           # close the non-capturing group and repeat it zero or more times
"            # match a double quote
|            # OR
\S+          # match one or more characters other than white space chars
In other words: match a quoted string, or match a word made up only of non-whitespace characters.
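To see the two alternatives at work on a smaller input, the sketch below wraps the same regex in a helper that returns the matches as a list. The class name QuotedTokenizer, the method tokenize and the shortened sample line are illustrative assumptions, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedTokenizer {
    // A quoted string (backslash escapes allowed) OR a run of non-whitespace characters.
    private static final Pattern TOKEN =
            Pattern.compile("\"(?:\\\\.|[^\\\\\"])*\"|\\S+");

    /** Returns every token found in the text, in order of appearance. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Shortened sample: the quoted record contains a space and escaped quotes,
        // yet it is returned as a single token.
        String sample = "M RA01 COMPLD \"SLOT-1-1-2,CMP:CR,PROC_FAIL,SA,,:\\\"Processor Failure\\\",\" ;";
        for (String token : tokenize(sample)) {
            System.out.println(token);
        }
    }
}
```

Because the quoted-string alternative comes first, a token that starts with a double quote is consumed up to its closing quote, including any spaces and escaped quotes inside it. Everything else falls through to \S+.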
Answer 2 (score: 0)
To parse any input, you must know its structure.