Question

I have a .txt input file that looks like this:

Start "String" (100, 100) Test One:
  Nextline 10;
  Test Second Third(2, 4, 2, 4):
    String "7";
    String "8";
    Test "";
  End;
End.

I plan to read this file in as a single String and then split it on certain delimiters. This code gets me almost all the way to the output I need:

String tr=  entireFile.replaceAll("\\s+", "");

String[] input = tr.split("(?<=[(,):;.])|(?=[(,):;.])|(?=\\p{Upper})");

My current output is:

Start"
String"
(
100
,
100
)
Test
One
:
Nextline10
;
Test
Second
Third
(
2
,
4
,
2
,
4
)
:
String"7"
;
String"8"
;
Test""
;
End
;
End
.

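However, I am having trouble with items inside quotation marks, and with a bare pair of quotes "" as a token of its own. "String", "7", and "" should each end up on their own line. Is there a way to do this with a regex? My expected output is below. Thanks for your help.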

Start
"String"
(
100
,
100
)
Test
One
:
Nextline
10
;
Test
Second
Third
(
2
,
4
,
2
,
4
)
:
String
"7"
;
String
"8"
;
Test
""
;
End
;
End
.

Answer 1

Here is the regex I came up with:

String[] input = entireFile.split(
        "\\s+|" +               // split on (and discard) runs of whitespace, or
        "(?<=\\()|" +           // split right after a ( via positive lookbehind, or
        "(?=[,).:;])|" +        // split right before any of , ) . : ; via positive lookahead, or
        "((?<!\\s)(?=\\())");   // split right before a ( that is not preceded by whitespace

For an explanation of the positive/negative lookahead/lookbehind terminology, see this answer.
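As a quick illustration of why lookarounds keep the delimiter while a plain pattern discards it, here is a minimal sketch. The class and method names (LookaroundSplitDemo, splitBefore, splitAfter, splitAround) are made up for this example and are not part of the answer's code.

```java
import java.util.Arrays;

class LookaroundSplitDemo {
    // Positive lookahead: split at the position just BEFORE each comma.
    // The match is zero-width, so the comma stays in the following piece.
    static String[] splitBefore(String s) {
        return s.split("(?=,)");
    }

    // Positive lookbehind: split at the position just AFTER each comma.
    // The comma stays at the end of the preceding piece.
    static String[] splitAfter(String s) {
        return s.split("(?<=,)");
    }

    // Both together: the comma is cut off on both sides and becomes its own piece.
    static String[] splitAround(String s) {
        return s.split("(?<=,)|(?=,)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.asList(splitBefore("a,b")));  // pieces: "a" and ",b"
        System.out.println(Arrays.asList(splitAfter("a,b")));   // pieces: "a," and "b"
        System.out.println(Arrays.asList(splitAround("a,b")));  // pieces: "a", "," and "b"
    }
}
```

A consuming pattern such as \\s+ removes the text it matches, which is why whitespace disappears from the result while the punctuation survives.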

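Note that you should apply this split directly to the contents of the input file, without stripping the whitespace first. In other words, take out the following line: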

String tr=  entireFile.replaceAll("\\s+", "");
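To try it end to end, here is a small self-contained sketch that runs the split above on the sample input from the question. The class name TokenizeDemo, the constant DELIMITERS, and the helper tokenize are only for illustration, and the file contents are inlined as a string literal rather than read from disk.

```java
class TokenizeDemo {
    // The split pattern from the answer, unchanged.
    static final String DELIMITERS =
            "\\s+|" +               // whitespace (discarded)
            "(?<=\\()|" +           // right after (
            "(?=[,).:;])|" +        // right before , ) . : ;
            "((?<!\\s)(?=\\())";    // right before ( when not preceded by whitespace

    // Hypothetical helper: splits the raw file text, whitespace still included.
    static String[] tokenize(String entireFile) {
        return entireFile.split(DELIMITERS);
    }

    public static void main(String[] args) {
        // Sample input from the question, inlined for the demo.
        String entireFile =
                "Start \"String\" (100, 100) Test One:\n" +
                "  Nextline 10;\n" +
                "  Test Second Third(2, 4, 2, 4):\n" +
                "    String \"7\";\n" +
                "    String \"8\";\n" +
                "    Test \"\";\n" +
                "  End;\n" +
                "End.\n";

        // Print one token per line, as in the expected output.
        for (String token : tokenize(entireFile)) {
            System.out.println(token);
        }
    }
}
```

Two limitations worth knowing, since the pattern relies on whitespace to separate most tokens: a quoted string that itself contains whitespace or one of the delimiters would still be broken apart, and a comma with no space after it (for example 2,4) would stay attached to the token that follows it.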
