Java regex: extract text up to a special character

Asked: 2012-03-22 16:51:06

Tags: java regex

String str = "Text0TEXT1.more text ";
String str = "Text0TEXT1(more text ";
String str = "Text0TEXT1{more text ";

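If I have a line that may contain one of several characters, such as . or ( or { or ;, how do I extract only TEXT1?

Update: TEXT1 is preceded by Text0, and the special character may or may not be present.

Update 2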

String str = "Beginning text Text I want . Text I don't want";
String str = "Beginning text with numbers Text I want ( Text I don't want )";
String str = "Beginning text with numbers Text I want { Text I don't want }";

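I need to extract "Text I want", but I'm getting the rest of the text all the way to the end. The special characters are . ( { :

3 answers:

Answer 0: (score: 3)

How about: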

^(?:[a-zA-Z ]+[0-9]+ )?([a-zA-Z ,]+)

The text you want is in group 1.

Explanation:

^                 : beginning of string
  (?:             : start non-capturing group
    [a-zA-Z ]+    : one or more letters or spaces
    [0-9]+        : one or more digits
                  : a space
  )?              : end of group, optional
  (               : start capture group 1
    [a-zA-Z ,]+   : one or more letters, spaces or commas
  )               : end of group
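
As a rough illustration, the pattern above might be applied in Java with java.util.regex as sketched below. The class name Group1Extract, the helper extract, and the sample input containing digits are assumptions made for this example; they are not part of the original answer. The optional prefix only matches when the beginning text ends in digits followed by a space; otherwise group 1 starts at the beginning of the string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
final class Group1Extract {
    // The pattern from the answer; it contains no backslashes, so no extra escaping is needed.
    private static final Pattern P =
            Pattern.compile("^(?:[a-zA-Z ]+[0-9]+ )?([a-zA-Z ,]+)");

    // Returns group 1 with surrounding spaces trimmed, or null when nothing matches.
    static String extract(String input) {
        Matcher m = P.matcher(input);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        // Assumed input: the beginning text ends in digits and a space.
        System.out.println(extract("Beginning text 123 Text I want ( Text I don't want )"));
        // Without digits the optional prefix is skipped, so group 1 starts at index 0.
        System.out.println(extract("Beginning text Text I want . Text I don't want"));
    }
}
```

The trim() call is an addition for convenience: the character class includes the space, so the raw group usually ends with the space that precedes the special character.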

Answer 1: (score: 0)

str.split("[^\\w\\s]+")[0]

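This returns the leading run of word characters ([a-zA-Z_0-9]) and whitespace, i.e. everything before the first character that is neither.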

import java.util.ArrayList;
import java.util.List;

// Collect the sample inputs, then print everything before the first
// character that is neither a word character nor whitespace.
List<String> str = new ArrayList<String>();
str.add("TEXT1.more text ");
str.add("TEXT1)more text ");
str.add("TEXT1}more text ");
str.add("Beginning text Text I want . Text I don't want");
str.add("Beginning text with numbers Text I want ( Text I don't want )");
str.add("Beginning text with numbers Text I want { Text I don't want }");
for(String s : str)
    System.out.println("input: [" + s + "], first word: " + s.split("[^\\w\\s]+")[0]);

produces

input: [TEXT1.more text ], first word: TEXT1
input: [TEXT1)more text ], first word: TEXT1
input: [TEXT1}more text ], first word: TEXT1
input: [Beginning text Text I want . Text I don't want], first word: Beginning text Text I want 
input: [Beginning text with numbers Text I want ( Text I don't want )], first word: Beginning text with numbers Text I want 
input: [Beginning text with numbers Text I want { Text I don't want }], first word: Beginning text with numbers Text I want 

Answer 2: (score: 0)

I set up a simple example that solves your problem with a regex using a positive lookahead:

[\w ]+(?=[.{(;])

The regex above extracts the part before the special character.
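
A minimal Java sketch of the lookahead approach might look like the following. The class name LookaheadExtract and the helper extract are hypothetical names chosen for this example. Because the lookahead requires one of the special characters to follow, the sketch returns an empty Optional when none is present, which is one possible way to handle the "may or may not be present" case from the question.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
final class LookaheadExtract {
    // Pattern from the answer, with the backslash doubled for a Java string literal.
    private static final Pattern P = Pattern.compile("[\\w ]+(?=[.{(;])");

    // Returns the first run of word characters and spaces that is followed
    // by one of . { ( ; or an empty Optional if no such run exists.
    static Optional<String> extract(String input) {
        Matcher m = P.matcher(input);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("Text0TEXT1.more text ").orElse("<no match>"));
        System.out.println(extract("Text0TEXT1{more text ").orElse("<no match>"));
        // No special character at all, so the lookahead can never succeed.
        System.out.println(extract("Text0TEXT1 more text ").orElse("<no match>"));
    }
}
```

Note that the match includes the Text0 prefix, since the pattern has no way to tell Text0 from TEXT1. A caller that prefers the whole input when no special character is found could use extract(s).orElse(s).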

Edit:

Is there a specific pattern to the TEXT0 part?