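This may sound like a very simple question, but how do I remove several different characters from a string without writing one line per character, which is the laborious way I am doing it now? I have written an example string below: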
String word = "Hello, t-his is; an- (example) line.";
word = word.replace(",", "");
word = word.replace(".", "");
word = word.replace(";", "");
word = word.replace("-", "");
word = word.replace("(", "");
word = word.replace(")", "");
System.out.println(word);
This produces "Hello this is an example line". What is a more efficient way to do this?
Answer 0 (score: 4)
Use
word = word.replaceAll("[,.;\\-()]", "");
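Note that the special character - (hyphen) should be escaped with a double backslash; otherwise it is interpreted as the start of a range inside the character class.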
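To illustrate the escaping point, here is a minimal sketch. The class name `HyphenEscapeDemo` and its methods are made up for this example and are not part of the answer. It shows the escaped form, an equivalent form with the hyphen placed last in the class, and what happens when the hyphen sits unescaped between `;` and `(`.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; the names are illustrative only.
final class HyphenEscapeDemo {

    // Removes , . ; - ( ) using the escaped-hyphen character class.
    static String strip(String s) {
        return s.replaceAll("[,.;\\-()]", "");
    }

    // Same set of characters, with the hyphen placed last so it needs no escape.
    static String stripHyphenLast(String s) {
        return s.replaceAll("[,.;()-]", "");
    }

    // Returns true if the given regex fails to compile.
    static boolean failsToCompile(String regex) {
        try {
            Pattern.compile(regex);
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String word = "Hello, t-his is; an- (example) line.";
        System.out.println(strip(word));
        System.out.println(stripHyphenLast(word));
        // Unescaped, ";-(" is read as a range from ';' down to '(' and is rejected.
        System.out.println(failsToCompile("[,.;-()]"));
    }
}
```

Placing the hyphen first or last in the class is a common alternative to escaping it; which one reads better is a matter of taste.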
Answer 1 (score: 2)
Although it is not as efficient as the original replace technique, you can use
word = word.replaceAll("\\p{Punct}+", "");
to replace a broader set of characters with a single, simple replaceAll expression.
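As a rough illustration of what "broader" means here, the sketch below (class and method names are hypothetical) applies the same expression to the question's string and to a second string containing punctuation the question never listed. By default `\p{Punct}` covers ASCII punctuation, so characters such as `_`, `!` and `@` are removed too, which may or may not be what you want.

```java
// Hypothetical demo class; the names are illustrative only.
final class PunctDemo {

    // Removes every character matched by \p{Punct} (ASCII punctuation by default).
    static String stripPunct(String s) {
        return s.replaceAll("\\p{Punct}+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripPunct("Hello, t-his is; an- (example) line."));
        // Also strips characters outside the original list, such as _ ! and @
        System.out.println(stripPunct("snake_case! user@example"));
    }
}
```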
Answer 2 (score: 1)
Without (ab)using regular expressions, I would do it like this:
String word = "Hello, t-his is; an- (example) line.";
String undesirable = ",.;-()";
int len1 = undesirable.length();
int len2 = word.length();
StringBuilder sb = new StringBuilder(len2);
outer: for (int j = 0; j < len2; j++) {
    char c = word.charAt(j);
    // Skip c if it matches any of the undesirable characters.
    for (int i = 0; i < len1; i++) {
        if (c == undesirable.charAt(i)) continue outer;
    }
    sb.append(c);
}
System.out.println(sb.toString());
The advantage is performance: you avoid the overhead of creating and parsing a regular expression.
You can encapsulate it in a method:
public static String removeCharacters(String word, String undesirable) {
    int len1 = undesirable.length();
    int len2 = word.length();
    StringBuilder sb = new StringBuilder(len2);
    outer: for (int j = 0; j < len2; j++) {
        char c = word.charAt(j);
        for (int i = 0; i < len1; i++) {
            if (c == undesirable.charAt(i)) continue outer;
        }
        sb.append(c);
    }
    return sb.toString();
}

public static String removeSpecialCharacters(String word) {
    return removeCharacters(word, ",.;-()");
}
Then you would use it like this:
public static void testMethod() {
    String word = "Hello, t-his is; an- (example) line.";
    System.out.println(removeSpecialCharacters(word));
}
Here is a performance test:
import java.util.regex.Pattern;

public class WordTest {
    public static void main(String[] args) {
        int iterations = 10000000;
        long t1 = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            testAsArray();
        }
        long t2 = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            testRegex();
        }
        long t3 = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            testAsString();
        }
        long t4 = System.currentTimeMillis();
        System.out.println("Without regex, but using copied arrays: " + (t2 - t1));
        System.out.println("With precompiled regex: " + (t3 - t2));
        System.out.println("Without regex, but using string: " + (t4 - t3));
    }

    // Variant 1: copy both strings into char arrays and compare element by element.
    public static void testAsArray() {
        String word = "Hello, t-his is; an- (example) line.";
        char[] undesirable = ",.;-()".toCharArray();
        StringBuilder sb = new StringBuilder(word.length());
        outer: for (char c : word.toCharArray()) {
            for (char h : undesirable) {
                if (c == h) continue outer;
            }
            sb.append(c);
        }
        sb.toString();
    }

    // Variant 2: index into the strings directly with charAt, without copying.
    public static void testAsString() {
        String word = "Hello, t-his is; an- (example) line.";
        String undesirable = ",.;-()";
        int len1 = undesirable.length();
        int len2 = word.length();
        StringBuilder sb = new StringBuilder(len2);
        outer: for (int j = 0; j < len2; j++) {
            char c = word.charAt(j);
            for (int i = 0; i < len1; i++) {
                if (c == undesirable.charAt(i)) continue outer;
            }
            sb.append(c);
        }
        sb.toString();
    }

    // Variant 3: a regex compiled once and reused on every call.
    private static final Pattern regex = Pattern.compile("[,\\.;\\-\\(\\)]");

    public static void testRegex() {
        String word = "Hello, t-his is; an- (example) line.";
        String result = regex.matcher(word).replaceAll("");
    }
}
Output on my machine (elapsed milliseconds):
Without regex, but using copied arrays: 5880
With precompiled regex: 11011
Without regex, but using string: 3844
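Note that the regex variant measured above uses a Pattern compiled once, whereas `String.replaceAll` compiles its pattern on every call. If you prefer to stay with regex, a small helper along the following lines keeps the compiled pattern around. This is only a sketch: the class name `PrecompiledStripper` and the constant `UNDESIRABLE` are made up here, and no timings are claimed for it.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
final class PrecompiledStripper {

    // Compiled once when the class is loaded, then reused for every call.
    private static final Pattern UNDESIRABLE = Pattern.compile("[,.;\\-()]");

    // Removes , . ; - ( ) from the input using the shared pattern.
    static String strip(String s) {
        return UNDESIRABLE.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("Hello, t-his is; an- (example) line."));
    }
}
```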
Answer 3 (score: 0)
You can try Java's String.replaceAll method with a regular expression:
word = word.replaceAll(",|\.|;|-|\(|\)", "");
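If you are not familiar with regular expressions, | means "or". So we are essentially saying: , or . or ; or - or ( or ).
See more: Java documentation for String.replaceAll
Edit
As pointed out, my previous version does not compile, because the backslashes must themselves be escaped inside a Java string literal. Just for the sake of correctness (although it has been noted that this is not the best solution), here is the corrected version of my regex: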
word = word.replaceAll(",|\\.|;|-|\\(|\\)", "");
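For a side-by-side view, the sketch below (hypothetical class and method names) runs the corrected alternation next to the character-class form from the top-voted answer. For this set of single characters the two should give the same result; the character class is simply the more compact way to say "any one of these".

```java
// Hypothetical demo class; the names are illustrative only.
final class AlternationVsClassDemo {

    // Alternation: each branch matches one literal character.
    static String withAlternation(String s) {
        return s.replaceAll(",|\\.|;|-|\\(|\\)", "");
    }

    // Character class: matches any single character from the set.
    static String withCharacterClass(String s) {
        return s.replaceAll("[,.;\\-()]", "");
    }

    public static void main(String[] args) {
        String word = "Hello, t-his is; an- (example) line.";
        System.out.println(withAlternation(word));
        System.out.println(withCharacterClass(word));
    }
}
```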
Answer 4 (score: 0)
Here is a solution that does the job with minimal effort; the toRemove string contains all the characters you do not want to see in the output:
// Requires: import java.nio.CharBuffer;
public static String removeChars(final String input, final String toRemove)
{
    final StringBuilder sb = new StringBuilder(input.length());
    final CharBuffer buf = CharBuffer.wrap(input);
    char c;
    while (buf.hasRemaining()) {
        c = buf.get();
        // Keep c only if it does not occur in toRemove.
        if (toRemove.indexOf(c) == -1)
            sb.append(c);
    }
    return sb.toString();
}
If you use Java 8, you can even use this (unfortunately there is no CharStream there, so a cast is needed...):
public static String removeChars(final String input, final String toRemove)
{
    final StringBuilder sb = new StringBuilder(input.length());
    input.chars().filter(c -> toRemove.indexOf((char) c) == -1)
        .forEach(i -> sb.append((char) i));
    return sb.toString();
}
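A possible variation on the Java 8 version, offered as a sketch rather than a replacement: `String.indexOf` has an overload that takes an `int`, so the cast in the filter can be dropped, and `IntStream.collect` can build the `StringBuilder` without a side-effecting `forEach`. The class name `StreamRemoveChars` is made up for this example.

```java
// Hypothetical demo class; the names are illustrative only.
final class StreamRemoveChars {

    // Keeps only the chars of input that do not occur in toRemove.
    static String removeChars(final String input, final String toRemove) {
        return input.chars()
            // indexOf(int) accepts the int value produced by chars() directly.
            .filter(c -> toRemove.indexOf(c) == -1)
            // Each value from chars() is a single char, so appendCodePoint appends it as is.
            .collect(StringBuilder::new,
                     StringBuilder::appendCodePoint,
                     StringBuilder::append)
            .toString();
    }

    public static void main(String[] args) {
        System.out.println(removeChars("Hello, t-his is; an- (example) line.", ",.;-()"));
    }
}
```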