JavaScript - Why does split keep returning empty elements?

Date: 2018-11-21 08:17:09

Tags: javascript regex split

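I'm trying to write JS code that splits a sentence into an array of elements, without any of the special characters (commas, periods, exclamation marks, question marks, and so on). However, when I display the list of all elements, there are blank entries. How can I get rid of them?

Here is my code: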

function Splitter() {
    var sentence = "Here is a sentence, with commas and with other characters, such as dots. And numbers 123 45 6!?";
    var chars = [' ', '\\\+', '-', '\\\(', '\\\)', '\\*', '/', ':', '\\\?', '!', '\\\,', '\\\.'];
    var parts = sentence.trim().split(new RegExp(chars.join('|'), 'g'));
    var longestIndex = -1;
    var longestWord = 0;
    
    for(var i=0; i < parts.length; i++){
        if(parts[i].length > longestWord){
            longestWord = parts[i].length;
            longestIndex = i;
        }
    }
    
    document.write("<b>Original sentence:</b><br>" + sentence);
    
    document.write("<br><br><b>how many words in sentence:</b> " + parts.length);
    
    document.write("<br><br><b>the longest word is:</b> " + parts[longestIndex] + "<br>(number of characters in this word: " + longestWord + ")");
    
    document.write("<br><br><b>fifth word:</b> " + parts[4]);
    
    document.write("<br><br><b>words:</b><br><ol>");
    
    for(var k=0; k<parts.length; k++) { 
        document.write("<li>" + parts[k] + "</li>"); 
    }
    
    document.write("</ol>");
    
}
    
Splitter();

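It counts the words and shows the longest one, but when all elements are displayed, the output contains blank entries (where the original sentence has a comma or an exclamation mark). The "fifth word" also comes out empty.

What am I doing wrong here?

2 answers:

Answer 0: (score: 1)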

You can chain .filter(Boolean) to the split result to remove those empty strings.
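As a minimal sketch of that suggestion, the snippet below reuses the sentence from the question. The pattern is written as a regex literal rather than built from the chars array, and the names delimiters, raw, and words are only illustrative.

```javascript
// The sentence from the question and an equivalent delimiter pattern.
const sentence = "Here is a sentence, with commas and with other characters, such as dots. And numbers 123 45 6!?";
const delimiters = / |\+|-|\(|\)|\*|\/|:|\?|!|,|\./g;

// ", " is two delimiters in a row, so split() reports the empty string between them.
const raw = sentence.split(delimiters);

// An empty string is falsy, so filter(Boolean) removes exactly those entries.
const words = raw.filter(Boolean);

console.log(raw.length, words.length); // 23 18
console.log(JSON.stringify(raw[4]));   // "" (the empty "fifth word" from the question)
console.log(words[4]);                 // with
```

Note that filter(Boolean) only drops the empty strings here because every element is a string; a word such as "0" is a non-empty string and is kept.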

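Note that you can simplify your regular expression. Instead of using pipes, you can put all the offending characters in a regex character class, like so:

/[ +\-()*\/:?!,.]+/g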

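With the extra + at the end, you also partly resolve the empty-string results, except for those that can still appear at the start and end of the array, so you would still need the filter.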
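A small sketch of that edge case, using short made-up strings instead of the sentence from the question:

```javascript
// Character class with a + quantifier: a run of delimiters counts as one separator.
const delimiters = /[ +\-()*\/:?!,.]+/g;

// A run in the middle of the string no longer yields an empty string.
const middle = "one, two".split(delimiters);

// Runs at the very start or end still leave an empty string on that side.
const edges = "...one, two!?".split(delimiters);

// So the filter is still needed for the general case.
const cleaned = edges.filter(Boolean);

console.log(JSON.stringify(middle));  // ["one","two"]
console.log(JSON.stringify(edges));   // ["","one","two",""]
console.log(JSON.stringify(cleaned)); // ["one","two"]
```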

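To avoid the filter entirely, you can use match instead of split, but with a negated class ([^). Here you must keep the + at the end:

var parts = sentence.match(/[^ +\-()*\/:?!,.]+/g);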
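One thing to watch with match: with the g flag it returns null, not an empty array, when nothing matches. The wrapper below is a hypothetical helper (extractWords is not part of the original answer) that sketches one way to guard against that.

```javascript
// Hypothetical helper: return the words of a text, or [] when there are none.
function extractWords(text) {
    // match() with the g flag returns null when there is no match at all,
    // so fall back to an empty array to keep .length and loops safe.
    return text.match(/[^ +\-()*\/:?!,.]+/g) || [];
}

const words = extractWords("Hello, world!?");
const none = extractWords("?!...");

console.log(JSON.stringify(words)); // ["Hello","world"]
console.log(none.length);           // 0
```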

Finally, also consider \w+. This condition is stricter than your current one, because it keeps only word characters (letters, digits, and the underscore):

var parts = sentence.match(/\w+/g);
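To see how much stricter \w+ is, here is a rough comparison on a made-up string. Without the u or v flag, \w covers only ASCII letters, digits, and the underscore, so apostrophes, accented letters, and symbols such as # all break a word.

```javascript
// Made-up input with an apostrophe, an accented letter, and a "#".
const text = "Don't stop, café #42!";

// Negated class: everything that is not a listed delimiter stays in the word.
const byNegatedClass = text.match(/[^ +\-()*\/:?!,.]+/g) || [];

// \w+: only ASCII letters, digits, and "_" count as part of a word.
const byWordChars = text.match(/\w+/g) || [];

console.log(JSON.stringify(byNegatedClass)); // ["Don't","stop","café","#42"]
console.log(JSON.stringify(byWordChars));    // ["Don","t","stop","caf","42"]
```

Which of the two is preferable depends on the input: for plain English text without contractions they give the same words.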

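Answer 1: (score: 0)

This happens because you use the period and the space as delimiters, and .split is designed to cut the string at each delimiter and leave the delimiters out of the result. When two delimiters are adjacent, as in ", ", the piece between them is an empty string.

You can use a positive lookahead, (?=, to test whether the pattern is present without consuming it, so the delimiter stays in the result.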

function Splitter() {
    var sentence = "Here is a sentence, with commas and with other characters, such as dots. And numbers 123 45 6!?";
    var chars = [' ', '\\\+', '-', '\\\(', '\\\)', '\\*', '/', ':', '\\\?', '!', '\\\,', '\\\.'];
    var parts = sentence.trim().split(new RegExp('(?='+chars.join('|')+')', 'g'));
    var longestIndex = -1;
    var longestWord = 0;
    
    for(var i=0; i < parts.length; i++){
        if(parts[i].length > longestWord){
            longestWord = parts[i].length;
            longestIndex = i;
        }
    }
    
    document.write("<b>Original sentence:</b><br>" + sentence);
    
    document.write("<br><br><b>how many words in sentence:</b> " + parts.length);
    
    document.write("<br><br><b>the longest word is:</b> " + parts[longestIndex] + "<br>(number of characters in this word: " + longestWord + ")");
    
    document.write("<br><br><b>fifth word:</b> " + parts[4]);
    
    document.write("<br><br><b>words:</b><br><ol>");
    
    for(var k=0; k<parts.length; k++) { 
        document.write("<li>" + parts[k] + "</li>"); 
    }
    
    document.write("</ol>");
    
}
    
Splitter();
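To see what a lookahead split returns, here is a reduced sketch on a short made-up string (the character class lists only a few of the delimiters). The empty entries disappear, but each delimiter stays attached to the start of a piece, so the special characters are not removed.

```javascript
// Zero-width lookahead: split *before* each delimiter without consuming it.
const keepDelimiters = /(?=[ ,.!?])/g;

const parts = "Hi, you!?".split(keepDelimiters);

console.log(JSON.stringify(parts)); // ["Hi",","," you","!","?"]
```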


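Edit

Maybe I misunderstood the question. If you only want to get rid of the blank entries, you can use the + quantifier to match special characters as a single delimiter even when they sit next to each other: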

function Splitter() {
    var sentence = "Here is a sentence, with commas and with other characters, such as dots. And numbers 123 45 6!?";
    var chars = [' ', '\\\+', '-', '\\\(', '\\\)', '\\*', '/', ':', '\\\?', '!', '\\\,', '\\\.'];
    var parts = sentence.trim().split(new RegExp('[('+chars.join('|')+')]+', 'g'));
    var longestIndex = -1;
    var longestWord = 0;
    
    for(var i=0; i < parts.length; i++){
        if(parts[i].length > longestWord){
            longestWord = parts[i].length;
            longestIndex = i;
        }
    }
    
    document.write("<b>Original sentence:</b><br>" + sentence);
    
    document.write("<br><br><b>how many words in sentence:</b> " + parts.length);
    
    document.write("<br><br><b>the longest word is:</b> " + parts[longestIndex] + "<br>(number of characters in this word: " + longestWord + ")");
    
    document.write("<br><br><b>fifth word:</b> " + parts[4]);
    
    document.write("<br><br><b>words:</b><br><ol>");
    
    for(var k=0; k<parts.length; k++) { 
        document.write("<li>" + parts[k] + "</li>"); 
    }
    
    document.write("</ol>");
    
}
    
Splitter();
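A caveat about the pattern in this last snippet, as far as I can tell: wrapping the pipe-joined list in [( ... )] turns the parentheses and pipes into literal members of the character class, and the sequence |-| is read as a range from | to |, so the hyphen itself is no longer a delimiter. The comparison below is a sketch on a made-up string; fromAnswer rebuilds the same pattern, and plainClass is the hand-written class from the other answer.

```javascript
// Same construction as in the snippet above (escapes written in their shortest form).
const chars = [' ', '\\+', '-', '\\(', '\\)', '\\*', '/', ':', '\\?', '!', ',', '\\.'];
const fromAnswer = new RegExp('[(' + chars.join('|') + ')]+', 'g');

// Hand-written class with the hyphen escaped explicitly.
const plainClass = /[ +\-()*\/:?!,.]+/g;

const text = "one-two, three|four!?";

// "-" is not split on, "|" is, and the trailing "!?" still leaves an empty string.
const a = text.split(fromAnswer);

// "-" is split on, "|" is not, and filter(Boolean) drops the trailing empty string.
const b = text.split(plainClass).filter(Boolean);

console.log(JSON.stringify(a)); // ["one-two","three","four",""]
console.log(JSON.stringify(b)); // ["one","two","three|four"]
```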