I am new to Java regex and would appreciate some help. Consider the paragraph below.
Paragraph:
Name abc
sadghsagh
hsajdjah Name
ggggggggg
!!!
Name ggg
dfdfddfdf Name
!!!
Name hhhh
sahdgashdg Name
asjdhjasdh
sadasldkalskd
asdjhakjsdhja
!!!
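I need to split the paragraph above into text blocks that start with Name and end with !!!. I do not want to use ! as the only delimiter for the split; the regex must also include the start sequence (Name).
In other words, the resulting API should look like SplitAsBlocks("paragraph", "starts with Name", "ends with !!!").
How can I achieve this? Any help is appreciated.
Now I want the same output that Brito gave, but here I have added Name after "hsajdjah". With that change, the text is split as below: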
Name
ggggggggg
!!!
But I need:
Name abc
sadghsagh
hsajdjah Name
ggggggggg
!!!
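That is, Name must be matched only when it is at the start of a line, not in the middle.
Please advise.
Bart, please see the input case below for your code.
I need to split the following content using your API with the parameters start => Name and end => !, but the output differs from what I expect: only 3 blocks start with Name and end with !. I have attached the output as well.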
String myInput = "Name hhhhh class0"+ "\n"+
"HHHHHHHHHHHHHHHHHH"+ "\n"+
"!"+ "\n"+
"Name TTTTT TTTT"+ "\n"+
"GGGGGG UUUUU IIII"+ "\n"+
"!"+ "\n"+
"Name JJJJJ WWWW"+ "\n"+
"IIIIIIIIIIIIIIIIIIIII"+ "\n"+
"!"+ "\n"+
"RRRRRRRRRRR TTTTTTTT"+ "\n"+
"HHHHHH"+ "\n"+
"JJJJJ 1 Name class1"+ "\n"+
"LLLLL 5 Name class5"+ "\n"+
"!"+ "\n"+
"OOOOOO HHHH FFFFFF"+ "\n"+
"service 0 Name class12"+ "\n"+
"!"+ "\n"+
"JJJJJ YYYYYY 3/0"+ "\n"+
"KKKKKKK"+ "\n"+
"UUU UUU UUUUU"+ "\n"+
"QQQQQQQ"+ "\n"+
"!";
String[] tokens = tokenize(myInput, "Name", "!");
int n = 0;
for (String t : tokens) {
    System.out.println("---------------------------\n" + (++n) + "\n" + t);
}
Output:
---------------------------
1
Name hhhhh class0
HHHHHHHHHHHHHHHHHH
!
---------------------------
2
Name TTTTT TTTT
GGGGGG UUUUU IIII
!
---------------------------
3
Name JJJJJ WWWW
IIIIIIIIIIIIIIIIIIIII
!
---------------------------
4
Name class1
LLLLL 5 Name class5
!
---------------------------
5
Name class12
!
Here I need Name to match only at the start of a line, not in the middle. How do I add that to the regex?
Answer 0 (score: 4)
Try:
import java.util.*;
import java.util.regex.*;

public class Main {

    public static String[] tokenize(String text, String start, String end) {
        // old line:
        //Pattern p = Pattern.compile("(?s)"+Pattern.quote(start)+".*?"+Pattern.quote(end));
        // new line:
        Pattern p = Pattern.compile("(?sm)^" + Pattern.quote(start) + ".*?" + Pattern.quote(end) + "$");
        Matcher m = p.matcher(text);
        List<String> tokens = new ArrayList<String>();
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens.toArray(new String[]{});
    }

    public static void main(String[] args) {
        String text = "Name abc" + "\n" +
                "sadghsagh" + "\n" +
                "hsajdjah Name" + "\n" +
                "ggggggggg" + "\n" +
                "!!!" + "\n" +
                "Name ggg" + "\n" +
                "dfdfddfdf Name" + "\n" +
                "!!!" + "\n" +
                "Name hhhh" + "\n" +
                "sahdgashdg Name" + "\n" +
                "asjdhjasdh" + "\n" +
                "sadasldkalskd" + "\n" +
                "asdjhakjsdhja" + "\n" +
                "!!!";
        String[] tokens = tokenize(text, "Name", "!!!");
        int n = 0;
        for (String t : tokens) {
            System.out.println("---------------------------\n" + (++n) + "\n" + t);
        }
    }
}
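To see why the (?sm) version behaves differently on the single-! input from the question, here is a minimal sketch of the same idea. The class name LineAnchoredBlocks is made up for illustration, and the input is a shortened copy of the question's second sample. With (?m), ^ and $ match at line boundaries, so a Name in the middle of a line can no longer start a block; with (?s), the dot can cross line breaks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineAnchoredBlocks {

    // Collects every block that begins with `start` at the start of a line
    // and runs lazily to the first `end` that sits at the end of a line.
    public static List<String> tokenize(String text, String start, String end) {
        Pattern p = Pattern.compile(
                "(?sm)^" + Pattern.quote(start) + ".*?" + Pattern.quote(end) + "$");
        Matcher m = p.matcher(text);
        List<String> blocks = new ArrayList<>();
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        String input = String.join("\n",
                "Name hhhhh class0", "HHHHHHHHHHHHHHHHHH", "!",
                "Name TTTTT TTTT", "GGGGGG UUUUU IIII", "!",
                "Name JJJJJ WWWW", "IIIIIIIIIIIIIIIIIIIII", "!",
                "RRRRRRRRRRR TTTTTTTT", "HHHHHH",
                "JJJJJ 1 Name class1", "LLLLL 5 Name class5", "!",
                "OOOOOO HHHH FFFFFF", "service 0 Name class12", "!");
        for (String block : tokenize(input, "Name", "!")) {
            System.out.println("-----");
            System.out.println(block);
        }
    }
}
```

On this input only the three blocks whose first line starts with Name are returned; the lines containing "Name class1" and "Name class12" are skipped because Name is not at the start of those lines.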
Answer 1 (score: 3)
String s = "Name abc sadghsagh hsajdjah !!! Name ggg dfdfddfdf !!! Name hhhh sahdgashdg asjdhjasdh sadasldkalskd asdjhakjsdhja !!!!! ";
String startsWith = "Name";
String endsWith = "!!!";
// non-greedily get all groups starting with Name and ending with !!!
String pattern = String.format("(%s).*?(%s)", Pattern.quote(startsWith), Pattern.quote(endsWith));
System.out.println(pattern);
Matcher m = Pattern.compile(pattern, Pattern.DOTALL).matcher(s);
while (m.find()) {
    System.out.println(m.group());
}
Output:
(\QName\E).*?(\Q!!!\E)
Name abc sadghsagh hsajdjah !!!
Name ggg dfdfddfdf !!!
Name hhhh sahdgashdg asjdhjasdh sadasldkalskd asdjhakjsdhja !!!
Answer 2 (score: 0)
If you want to keep Name and !!! in the result, the following should work as well.
Edit: the earlier attempt, string.split("(?=(Name|!!!))"), has been withdrawn. This is the corrected version:
String[] parts = string.split("(?<=!!!)\\s*(?=Name)");
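This splits on any whitespace between !!! and Name without consuming either marker, so both are kept in the resulting parts. If you do not want to split on !!!Name (with nothing in between), replace \\s* with \\s+ so that one or more whitespace characters are required instead of zero or more.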
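The difference between \\s* and \\s+ is easiest to see side by side. The sketch below is only an illustration: the class SplitKeepDelimiters, the helper splitBlocks, and the sample text are assumptions, not part of the code above. The lookbehind (?<=!!!) and lookahead (?=Name) are zero-width, so only the whitespace in between is consumed by the split.

```java
import java.util.Arrays;

class SplitKeepDelimiters {

    // Splits between an end marker (!!!) and the next start marker (Name).
    // `gap` is the regex for what may sit between them, e.g. "\\s*" or "\\s+".
    public static String[] splitBlocks(String text, String gap) {
        return text.split("(?<=!!!)" + gap + "(?=Name)");
    }

    public static void main(String[] args) {
        // The second and third blocks are glued together as "!!!Name".
        String text = "Name a\nx\n!!!\nName b\n!!!Name c\n!!!";
        System.out.println(Arrays.toString(splitBlocks(text, "\\s*")));
        System.out.println(Arrays.toString(splitBlocks(text, "\\s+")));
    }
}
```

With \\s* the text is cut into three parts, including at the !!!Name junction; with \\s+ that junction is left alone and only two parts come back.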
Edit 2: here is an example with input and output. The input is copied from the original question:
String string = "Name hhhhh class0" + "\n" + "HHHHHHHHHHHHHHHHHH" + "\n" + "!" + "\n"
+ "Name TTTTT TTTT" + "\n" + "GGGGGG UUUUU IIII" + "\n" + "!" + "\n"
+ "Name JJJJJ WWWW" + "\n" + "IIIIIIIIIIIIIIIIIIIII" + "\n" + "!" + "\n"
+ "RRRRRRRRRRR TTTTTTTT" + "\n" + "HHHHHH" + "\n" + "JJJJJ 1 Name class1" + "\n"
+ "LLLLL 5 Name class5" + "\n" + "!" + "\n" + "OOOOOO HHHH FFFFFF" + "\n"
+ "service 0 Name class12" + "\n" + "!" + "\n" + "JJJJJ YYYYYY 3/0" + "\n" + "KKKKKKK"
+ "\n" + "UUU UUU UUUUU" + "\n" + "QQQQQQQ" + "\n" + "!";
String[] parts = string.split("(?<=!)\\s*(?=Name)");
for (String part : parts) {
    System.out.println(part);
    System.out.println("---------------------------------");
}
Output:
Name hhhhh class0
HHHHHHHHHHHHHHHHHH
!
---------------------------------
Name TTTTT TTTT
GGGGGG UUUUU IIII
!
---------------------------------
Name JJJJJ WWWW
IIIIIIIIIIIIIIIIIIIII
!
RRRRRRRRRRR TTTTTTTT
HHHHHH
JJJJJ 1 Name class1
LLLLL 5 Name class5
!
OOOOOO HHHH FFFFFF
service 0 Name class12
!
JJJJJ YYYYYY 3/0
KKKKKKK
UUU UUU UUUUU
QQQQQQQ
!
---------------------------------
Looks good?