Regex to split by comma, unless inside a string or a list `[]`

Time: 2018-04-18 17:04:42

Tags: javascript regex split

Given a string such as

1, 'str,ing', [1, 2, [3, 4, 5, 'str,ing']], 'st[rin,g]['

I want to split it on commas, ignoring any commas inside strings or inside square brackets. So I would like the output to be a list of:

1

'str,ing'

[1, 2, [3, 4, 5, 'str,ing']]

'st[rin,g]['

The closest I have gotten is `,(?=(?:[^'"[\]]*['"[\]][^'"[\]]*['"[\]])*[^'"[\]]*$)`, but this does not account for the fact that a `]` does not close a `'`, and so on.
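To see the failure concretely, here is a quick check of that pattern against the sample input. The variable names are illustrative only. The lookahead merely requires an even number of quote or bracket characters after the comma, so it cannot tell which delimiter closes which.

```javascript
// The attempted pattern: split on a comma that is followed by an even
// number of quote/bracket characters before the end of the input.
const attempt = /,(?=(?:[^'"[\]]*['"[\]][^'"[\]]*['"[\]])*[^'"[\]]*$)/;
const sample = "1, 'str,ing', [1, 2, [3, 4, 5, 'str,ing']], 'st[rin,g]['";

const pieces = sample.split(attempt);

// 13 delimiter characters follow the first comma (an odd count), so that
// comma is not a split point; the comma inside 'str,ing' is one instead.
console.log(pieces[0]); // 1, 'str
```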

2 Answers:

Answer 0: (score: 3)

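Regular expressions describe regular languages, not context-free ones, which means they cannot parse depth-based logic (e.g., nested arrays). If you are dealing with malformed data, you will have to make some assumptions about it and step through it manually.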

The example for this answer assumes that every quote has a matching closing quote and that `[` and `]` are not special inside a quoted string. (It checks for the end of the string to prevent a runaway loop.)
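A minimal sketch of a manual scan under those assumptions could look like the following. This is not the answerer's original code: the function name `splitTopLevel` is hypothetical, quoted text is treated as opaque, and escaped quotes are deliberately not handled.

```javascript
// Hypothetical helper: split on commas that sit outside quotes and brackets.
function splitTopLevel(input) {
  const parts = [];
  let depth = 0; // current bracket nesting depth
  let start = 0; // index where the current piece begins
  let i = 0;

  while (i < input.length) {
    const char = input[i];
    if (char === "'" || char === '"') {
      // Jump to the matching quote; brackets and commas inside are ignored.
      const close = input.indexOf(char, i + 1);
      if (close === -1) break; // unmatched quote: stop instead of looping on
      i = close;
    } else if (char === '[') {
      depth++;
    } else if (char === ']') {
      if (depth > 0) depth--; // an unmatched ']' is ignored
    } else if (char === ',' && depth === 0) {
      parts.push(input.slice(start, i).trim());
      start = i + 1;
    }
    i++;
  }

  parts.push(input.slice(start).trim()); // whatever remains is the last piece
  return parts;
}

const result = splitTopLevel("1, 'str,ing', [1, 2, [3, 4, 5, 'str,ing']], 'st[rin,g]['");
console.log(result);
```

On the sample input from the question this yields the four expected pieces. An unterminated quote simply ends the scan, and the remainder of the input becomes the last piece.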


Answer 1: (score: 1)

Building on @Terza's answer above, I added some logic to handle strings inside brackets and escaped quotes inside strings.

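class ParamSplitter {
  constructor(string) {
    this.string = string;
    this.index = -1;     // position of the special character being examined
    this.startIndex = 0; // start of the parameter currently being collected
    this.params = [];
  }

  splitByParams() {
    let depth = 0; // current bracket nesting depth

    // Visit each special character in turn; a quote is skipped to its closing quote.
    // An unterminated quote makes skipQuote() return false, which ends the scan.
    while (this.nextIndex() && (!this.atQuote() || this.skipQuote())) {
      let char = this.string[this.index];
      if (char === '[')
        depth++;
      else if (char === ']')
        depth = Math.max(0, depth - 1); // ignore an unmatched ']'
      else if (char === ',' && !depth) {
        this.addParam(this.index);
        this.startIndex = this.index + 1;
      }
    }

    this.addParam(this.string.length); // the last parameter runs to the end of the input
    return this.params;
  }

  findIndex(regex, start) { // returns -1 or index of match
    let index = this.string.substring(start).search(regex);
    return index >= 0 ? index + start : -1;
  }

  nextIndex() { // moves index to the next comma, quote, or bracket
    this.index = this.findIndex(/[,'"[\]]/, this.index + 1);
    return this.index !== -1;
  }

  atQuote() {
    let char = this.string[this.index];
    return char === '"' || char === "'";
  }

  skipQuote() { // moves index to the closing quote; returns false if there is none
    let quote = this.string[this.index];
    let i = this.index + 1;
    while (i < this.string.length && this.string[i] !== quote)
      i += this.string[i] === '\\' ? 2 : 1; // a backslash escapes the next character
    if (i >= this.string.length)
      return false;
    this.index = i;
    return true;
  }

  addParam(end) { // stores the trimmed text from startIndex up to (not including) end
    this.params.push(this.string.substring(this.startIndex, end).trim());
  }
}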

let run = string => new ParamSplitter(string).splitByParams();
let input = "1, 'str,ing', [1, 2, [3, 4, 5, 'str,ing']], 'st[rin,g][', 'text\\'moretext', ['two', ']', 'three'], 4";
console.log(run(input));