I need a tokenizer that, given a string with arbitrary whitespace between words, produces an array of words with no empty substrings.
For example, given the string:
" I dont know what you mean by glory  Alice said."
I use:
str2.split(" ")
This also returns empty substrings:
["", "I", "dont", "know", "what", "you", "mean", "by", "glory", "", "Alice", "said."]
How can I filter the empty strings out of the array?
Answer 0: (score: 15)
You may not even need to filter; just split on this regular expression:
" I dont know what you mean by glory  Alice said.".split(/\b\s+/)
Answer 1: (score: 8)
str.match(/\S+/g)
returns a list of the non-whitespace sequences: ["I", "dont", "know", "what", "you", "mean", "by", "glory", "Alice", "said."]
(Note that this includes the period in "said.")
str.match(/\w+/g)
returns a list of all the words: ["I", "dont", "know", "what", "you", "mean", "by", "glory", "Alice", "said"]
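Two details of `match` may matter in practice: with the `g` flag it returns `null` rather than an empty array when nothing matches, and `\w` does not include apostrophes. A minimal sketch, where `tokenize` is a hypothetical helper name introduced only for illustration:

```javascript
// Hypothetical helper: fall back to [] because match() with a /g regex
// returns null when there are no matches at all.
function tokenize(text, pattern) {
  return text.match(pattern) || [];
}

console.log(tokenize("   ", /\S+/g));         // []
console.log(tokenize("don't stop.", /\S+/g)); // ["don't", "stop."]
console.log(tokenize("don't stop.", /\w+/g)); // ["don", "t", "stop"]
```

So `/\S+/g` keeps punctuation attached to words, while `/\w+/g` drops it but also splits contractions.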
Answer 2: (score: 7)
You should trim the string before using split.
var str = " I dont know what you mean by glory  Alice said.";
// Strip leading and trailing whitespace
var trimmed = str.replace(/^\s+|\s+$/g, '');
// Split the trimmed string, not the original
var words = trimmed.split(" ");
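Trimming only handles the ends of the string, so a run of spaces in the middle would still produce empty strings. A minimal sketch of combining the trim with `filter`, assuming the sample string has a double space before "Alice":

```javascript
// Sample input; the double space before "Alice" is an assumption.
const str = " I dont know what you mean by glory  Alice said.";

// Trimming removes only leading and trailing whitespace...
const pieces = str.trim().split(" ");
// ...so the double space still yields one empty string.
console.log(pieces.includes("")); // true

// filter(Boolean) drops every falsy element, which for strings means "".
const words = pieces.filter(Boolean);
console.log(words.length); // 10
```

This also answers the question as literally asked: `filter(Boolean)` removes the empty strings from an array that `split(" ")` has already produced.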
Answer 3: (score: 2)
I'd suggest .match:
str.match(/\b\w+\b/g);
This matches words between word boundaries, so whitespace is never matched and therefore never appears in the resulting array.
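As a quick check, the explicit `\b` anchors do not appear to change the result: a greedy `\w+` already starts and stops at word boundaries. The sketch below compares both patterns on the sample string (spacing assumed from the question's output).

```javascript
// Sample input; the exact spacing is an assumption.
const str = " I dont know what you mean by glory  Alice said.";

const withBoundaries = str.match(/\b\w+\b/g);
const withoutBoundaries = str.match(/\w+/g);

// Both patterns yield the same tokens for this input.
console.log(JSON.stringify(withBoundaries) === JSON.stringify(withoutBoundaries)); // true

// The trailing period is not a word character, so it is dropped.
console.log(withBoundaries[withBoundaries.length - 1]); // "said"
```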
Answer 4: (score: 0)
Answer 5: (score: 0)
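I think the empty substrings appear because the text contains runs of multiple spaces. You can call replace() in a for loop to collapse each run into a single space, then use split() to break the program on single spaces, like this: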
// Get the full program text from the div
var program = document.getElementById("ans").textContent;
// Collapse multiple spaces: each call replaces the first double space with a single one
var res = program.replace("  ", " ");
for (var i = 0; i <= program.length; i++) {
    res = res.replace("  ", " ");
}
// Split into words using a single space as the separator
var result = res.split(" ");
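The loop is needed because `replace` with a string pattern changes only the first occurrence. A regular expression with the `g` flag can collapse every run in one pass instead. A minimal sketch, using a made-up string in place of the div's `textContent`:

```javascript
// Stand-in for the div's textContent; the spacing is made up for illustration.
const program = "var  x   =    1;";

// The g flag replaces every run of two or more spaces in a single pass,
// so no loop is needed.
const collapsed = program.replace(/ {2,}/g, " ");
const result = collapsed.split(" ");

console.log(result); // ["var", "x", "=", "1;"]
```

Leading or trailing spaces would still produce an empty first or last element, so trimming before the split may be worthwhile.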