I have the following string:
String input = "Remove from em?ty sentence 1? Remove from sentence 2! But not from ip address 190.168.10.110!";
I want to remove punctuation, but only where it ends a sentence. My output needs to be:
String str = "Remove from em?ty sentence 1 Remove from sentence 2 But not from ip address 190.168.10.110";
I am using the following code:
while (stream.hasNext()) {
    token = stream.next();
    char[] tokenArray = token.toCharArray();
    token = token.trim();
    if(token.matches(".*?[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}[\\.\\?!]+")){
        System.out.println("case2");
        stream.previous();
        int len = token.length()-1;
        for(int i = token.length()-1; i>7; i--){
            if(tokenArray[i]=='.'||tokenArray[i]=='?'||tokenArray[i]=='!'){
                --len;
            }
            else
                break;
        }
        stream.set(token.substring(0, len+1));
    }
    else if(token.matches(".*?\\b[a-zA-Z_0-9]+\\b[\\.\\?!]+")){
        System.out.println("case1");
        stream.previous();
        str = token.replaceAll("[\\.\\?!]+", "");
        stream.set(str);
        System.out.println(stream.next());
    }
}
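Each 'token' is taken from the 'input' string. Can you point out what I am doing wrong, either in the regex or in the logic?
A punctuation mark counts as punctuation only when it ends a sentence: not when it is part of an IP address, and not when it is inside a word such as !true or emp?ty (those should be left alone). It may be followed by either a space or the end of the string.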
Answer 0: (score: 1)
You can use this pattern:
\\p{Punct}(?=\\s|$)
and replace each match with an empty string.
Example:
String subject = "Remove from em?ty sentence 1? Remove from sentence 2! But not from ip address 190.168.10.110!";
String regex = "\\p{Punct}(?=\\s|$)";
String result = subject.replaceAll(regex, "");
System.out.println(result);
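Because the lookahead requires whitespace or the end of input immediately after the matched character, the pattern above removes only the last mark of a run such as ?!. A minimal sketch of a variant with a + quantifier that strips the whole run; the class and method names are made up for illustration and are not part of the answer:

```java
public class TrailingPunctuation {
    // Hypothetical helper: removes runs of punctuation that are directly
    // followed by whitespace or by the end of the input.
    public static String stripSentencePunctuation(String text) {
        return text.replaceAll("\\p{Punct}+(?=\\s|$)", "");
    }

    public static void main(String[] args) {
        String input = "Remove from em?ty sentence 1? Remove from sentence 2! "
                + "But not from ip address 190.168.10.110!";
        // Marks inside "em?ty" and inside the IP address are not followed
        // by whitespace, so the lookahead leaves them untouched.
        System.out.println(stripSentencePunctuation(input));
        System.out.println(stripSentencePunctuation("Really?! Yes."));
    }
}
```

Note that \p{Punct} covers all ASCII punctuation, so a closing bracket or quote in front of a space would be removed too; a narrower class such as [.?!] may be a better fit if only sentence terminators should go.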
Answer 1: (score: 0)
String input = "Remove from em?ty sentence 1? Remove from sentence 2! But not from ip address 190.168.10.110!";
System.out.println(input.replaceAll("[?!]", ""));
This gives the following output (note that the ? inside em?ty is removed as well):
Remove from emty sentence 1 Remove from sentence 2 But not from ip address 190.168.10.110
Answer 2: (score: 0)
Why not use
string.replaceAll("[?!] ", "");
Answer 3: (score: 0)
I would do it the other way around.
// Remove a punctuation mark that is directly followed by a space,
// keeping the space itself.
token = token.replaceAll("[.!:?;] ", " ");
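Here I am assuming that the punctuation mark is followed by a space. That leaves only the last punctuation mark in the text, which you can remove separately.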
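The two steps described in this answer could be sketched as follows. This is only an illustration under the answer's assumption that sentence punctuation is followed by a single space; the class and method names are hypothetical.

```java
public class TwoStepCleanup {
    // Hypothetical helper implementing the two passes described above.
    public static String clean(String text) {
        // Step 1: replace "mark + space" with just the space.
        String step1 = text.replaceAll("[.!:?;] ", " ");
        // Step 2: drop the one mark that may remain at the very end.
        return step1.replaceAll("[.!:?;]$", "");
    }

    public static void main(String[] args) {
        String input = "Remove from em?ty sentence 1? Remove from sentence 2! "
                + "But not from ip address 190.168.10.110!";
        System.out.println(clean(input));
    }
}
```

The dots inside the IP address and the ? inside em?ty survive because none of them is followed by a space or sits at the end of the input.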
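Answer 4: (score: 0)
Something like this might work. It first matches, and captures, everything that should be kept, and only then matches the punctuation that matters to you: [,.!?]
Just replace with $1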
# ([^\pL\pN\s]*[\pL\pN](?:[\pL\pN_-]|\pP(?=[\pL\pN\pP_-]))*)|[,.!?]
# "([^\\pL\\pN\\s]*[\\pL\\pN](?:[\\pL\\pN_-]|\\pP(?=[\\pL\\pN\\pP_-]))*)|[,.!?]"
( # (1 start)
[^\pL\pN\s]* [\pL\pN]
(?:
[\pL\pN_-]
| \pP
(?= [\pL\pN\pP_-] )
)*
) # (1 end)
|
[,.!?]
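To try this in Java, the pattern can be passed to String.replaceAll with $1 as the replacement. A minimal sketch; the class, constant, and method names are placeholders and not part of the answer:

```java
public class KeepTokensDropPunctuation {
    // The pattern from the answer, written as a Java string literal.
    private static final String REGEX =
        "([^\\pL\\pN\\s]*[\\pL\\pN](?:[\\pL\\pN_-]|\\pP(?=[\\pL\\pN\\pP_-]))*)|[,.!?]";

    public static String clean(String text) {
        // Group 1 captures a whole token and is written back unchanged.
        // When only the second alternative matches, group 1 is unset and
        // "$1" expands to nothing, so the stray mark is dropped.
        return text.replaceAll(REGEX, "$1");
    }

    public static void main(String[] args) {
        String input = "Remove from em?ty sentence 1? Remove from sentence 2! "
                + "But not from ip address 190.168.10.110!";
        System.out.println(clean(input));
    }
}
```

Inside a token, a punctuation mark is consumed only if a letter, digit, or further punctuation follows it, which is why the dots of the IP address stay while the final ! does not.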