Question

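I'm having some trouble excluding the part of a string that comes after a "#" symbol.

Let me explain in more detail:

This is a sample of the text a user can enter in the text area: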

Some Text
Some Text again #A comment
#A comment line
Another Text
Another Text again#Comment

I need to read this text and ignore everything after a "#" symbol.

This is the expected output:

Some Text;Some Text again;Another Text;Another Text again

As for the code I have so far:

Replace all line breaks with ";":

readText = userInputTextArea.getText();
readTextAllInALine = readText.replaceAll("\\n", ";");

So the output after this step is:

Some Text;Some Text again #A comment;#A comment line;Another Text;Another Text again#Comment

The following code ignores all the characters after the first "#", but if we read the text sequentially it only works for the first line.

int startIndex = inputCommandText.indexOf("#");
int endIndex = inputCommandText.indexOf(";");
String toBeReplaced = inputCommandText.substring(startIndex, endIndex);
readTextAllInALine.replace(toBeReplaced, "");

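I've been looking for a way to get the expected output. I thought of using a StringTokenizer to process each line, removing the text after "#" (or skipping the whole line if it starts with "#"), and then printing all the tokens (i.e. all the lines) separated by ";", but I can't get it to work.

Any help would be appreciated.

Thank you very much.

Regards.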
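The per-line approach described in the question could be sketched roughly as follows. This is only an illustration under a few assumptions: it uses String.split instead of StringTokenizer, it trims whitespace on both ends of each line, and the class and method names (HashCommentStripper, stripComments) are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class HashCommentStripper {

    // Hypothetical helper: drops everything from '#' to the end of each line,
    // skips lines that end up empty, and joins the remaining lines with ';'.
    static String stripComments(String text) {
        List<String> kept = new ArrayList<>();
        // "\\R" matches any line break (\n, \r\n, \r); it needs Java 8 or later.
        for (String line : text.split("\\R")) {
            int hash = line.indexOf('#');
            String content = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (!content.isEmpty()) {
                kept.add(content);
            }
        }
        return String.join(";", kept);
    }

    public static void main(String[] args) {
        String input = "Some Text\n"
                + "Some Text again #A comment\n"
                + "#A comment line\n"
                + "Another Text\n"
                + "Another Text again#Comment";
        // prints: Some Text;Some Text again;Another Text;Another Text again
        System.out.println(stripComments(input));
    }
}
```

Working line by line avoids having to clean up empty segments afterwards, at the cost of a few more lines than a single regex.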

Answer 1

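Call this replace on the plain string retrieved from the text input. The regex #[^;]* matches everything from the hash up to (but not including) the next semicolon, and the match is then replaced with an empty string.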

public static void main(String[] args) {
    String text = "Some Text;Some Text again #A comment;#A comment line;Another Text;Another Text again#Comment";
    System.out.println(text);
    text = text.replaceAll("#[^;]*", "");
    System.out.println(text);
}
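For what it's worth, on the sample string this replacement leaves the space before the first comment and an empty segment where the comment-only line was, i.e. "Some Text;Some Text again ;;Another Text;Another Text again". Below is a minimal sketch of one way to tidy that up, assuming two extra passes are acceptable. The class and method names (SemicolonCommentCleaner, clean) are invented for illustration and are not part of the answer above.

```java
class SemicolonCommentCleaner {

    // Hypothetical helper: removes '#' comments from a ';'-separated string
    // and cleans up the separators that the removal leaves behind.
    static String clean(String joined) {
        return joined
                .replaceAll("\\s*#[^;]*", "") // the comment plus any whitespace before it
                .replaceAll(";{2,}", ";")     // collapse empty segments into one ';'
                .replaceAll("^;|;$", "");     // drop a leading or trailing ';'
    }

    public static void main(String[] args) {
        String text = "Some Text;Some Text again #A comment;#A comment line;"
                + "Another Text;Another Text again#Comment";
        // prints: Some Text;Some Text again;Another Text;Another Text again
        System.out.println(clean(text));
    }
}
```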

Answer 2

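A regular expression is useful here, but it is tricky because your pattern is moderately complex. The comments are end-of-line comments, so they can show up in several arrangements.

I came up with the following two passes: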

replaceAll(" *(#.*(?=\\n|$))", "").replaceAll("\\n+", ";");

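Using two passes gets around the fact that you sometimes end up with repeated line breaks. The first expression replaces the comments but not the line breaks, and the second expression replaces each run of line breaks with a single semicolon.

The parts of the first-pass expression are as follows: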

" *"

This includes zero or more leading spaces in the comment match. I.e., in "...again #A...", we want to remove the space between the n and the #.

"(#.* )"

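The start of the comment match: a # followed by zero or more characters. (By default, . matches any character except a line terminator.)

"(?= )"

This is a positive lookahead, which is where the regex starts to get tricky. It looks for whatever is inside this expression but does not include it in the matched text. It asserts that #.* is followed by a certain string, without replacing that string.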

"\\n|$"

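The lookahead finds either a newline or the end anchor. This finds comments that end with a newline, as well as a comment that sits at the very end of the String. But again, because it is inside the lookahead, the newline is not replaced.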
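To make the effect of the lookahead concrete, here is a small sketch that compares the first-pass expression with a variant that consumes the newline instead. The variant pattern and the names (LookaheadDemo, withLookahead, withoutLookahead) are assumptions made up for this comparison only.

```java
class LookaheadDemo {

    // The first-pass expression from above: the newline is only asserted, not matched.
    static String withLookahead(String text) {
        return text.replaceAll(" *(#.*(?=\\n|$))", "");
    }

    // Hypothetical variant for comparison: the newline is part of the match,
    // so it is removed together with the comment.
    static String withoutLookahead(String text) {
        return text.replaceAll(" *#.*(\\n|$)", "");
    }

    public static void main(String[] args) {
        String text = "a #x\nb";
        // The line break survives, so "a" and "b" stay on separate lines.
        System.out.println(withLookahead(text).equals("a\nb"));  // prints: true
        // The line break is consumed, so the two lines are merged into "ab".
        System.out.println(withoutLookahead(text).equals("ab")); // prints: true
    }
}
```

Keeping the line breaks in place during the first pass is what lets the second pass turn them into separators.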

So given the input:

String text = (
    "Some Text" + '\n' +
    "Some Text again #A comment" + '\n' +
    "#A comment line" + '\n' +
    "Another Text" + '\n' +
    "Another Text again#Comment"
);

System.out.println(
    text.replaceAll(" *(#.*(?=\\n|$))", "").replaceAll("\\n+", ";")
);

The output is:

Some Text;Some Text again;Another Text;Another Text again

Answer 3

To be clear, Coxer's answer is the way to go: it is more precise and cleaner. But anyway, if you want to try it, here is a working recursive solution:

public class IgnoreHash {

    @Test
    public void test() {
        // Note the trailing ';': the method below relies on every comment being followed by one.
        String readTextAllInALine = "Some Text;Some Text again #A comment;#A comment line;Another Text;Another Text again#Comment;";
        String actualResult = removeHashComments(readTextAllInALine);
        Assert.assertEquals(actualResult, "Some Text;Some Text again ;Another Text;Another Text again");
    }

    // Recursively copies the text before each '#' and skips ahead to the next ';'.
    // Anything after the last comment is dropped, which is why the trailing ';' disappears.
    private String removeHashComments(String input) {
        StringBuilder result = new StringBuilder();
        int hashIndex = input.indexOf("#");
        int endIndex = input.indexOf(";");

        if (hashIndex != -1) {
            result.append(input.substring(0, hashIndex));
            if (hashIndex < endIndex) {
                // first line: the '#' comes before the first ';', continue from that ';'
                result.append(removeHashComments(input.substring(endIndex)));
            } else if (endIndex == hashIndex - 1) {
                // the case of ";#": skip the whole comment, including its closing ';'
                int endIndex2 = input.indexOf(";", hashIndex + 1);
                result.append(removeHashComments(input.substring(endIndex2 + 1)));
            } else {
                // a ';' occurs earlier than the '#': continue from the '#' itself
                result.append(removeHashComments(input.substring(hashIndex)));
            }
        }

        return result.toString();
    }
}

Answer 4

readText = userInputTextArea.getText();
readText = readText.replaceAll("\\s*#[^\n]*", "");
readText = readText.replaceAll("\n+", ";");
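The snippet above comes without an explanation, so here is a rough sketch of how the two replacements behave on the question's sample. One detail worth noting is that \s also matches line breaks, so a comment-only line is removed together with the line break in front of it. The wrapper class and method names (TwoPassDemo, removeComments, joinLines) are hypothetical and only there to make the example runnable.

```java
class TwoPassDemo {

    // First pass: remove each comment plus any whitespace (including line breaks) before it.
    static String removeComments(String text) {
        return text.replaceAll("\\s*#[^\n]*", "");
    }

    // Second pass: turn each run of line breaks into a single ';'.
    static String joinLines(String text) {
        return text.replaceAll("\n+", ";");
    }

    public static void main(String[] args) {
        String input = "Some Text\n"
                + "Some Text again #A comment\n"
                + "#A comment line\n"
                + "Another Text\n"
                + "Another Text again#Comment";
        // prints: Some Text;Some Text again;Another Text;Another Text again
        System.out.println(joinLines(removeComments(input)));
    }
}
```

If the very first line of the input is a comment, its line break has nothing in front of it to be attached to, so the result would start with a ";" that may need trimming.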

Java - Ignoring part of a string that contains "#"
