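JavaScript regex for parsing multiple-choice questions

Time: 2017-10-22 19:22:52

Tags: javascript regex

I am trying to parse multiple-choice questions from a .docx file into Anki flashcards.

I want to build a 'question' match group and an 'answer' match group from the docx.

While trying to capture the questions, I cannot get the regex to behave the way I want.

Sample questions: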

    1.  Hormones and other signal molecules bind with ____ affinities to their receptors and are produced at concentrations ____ their KD values.
a.
low; far above
b.
moderate; far above
c.
moderate; equivalent to
d.
high; far below
e.
very high; equivalent to


ANS:    E    

    2.  Steroid hormones, such as glucocorticoids, effect their action by:
a.
binding to a plasma membrane receptor, which stimulates a signal transduction pathway within the cell
b.
binding to a plasma membrane receptor, which stimulates the receptor to enter the cell
c.
entering into the cell and affecting the production of secondary messengers
d.
entering into the cell and then acting as transcription regulators
e.
both a and d are correct


ANS:    E    

    3.  All are unifying features of polypeptide hormones EXCEPT that they are:
a.
originally synthesized with signal sequences.
b.
synthesized as inactive preprohormones.
c.
activated from preprohormones to hormones by phosphorylation.
d.
may produce several different peptide hormones with suitable processing.
e.
all are true.


ANS:    C    

    4.  Each of the following statements is true EXCEPT:
a.
epinephrine is an amino acid derivative
b.
steroid hormones can enter cells and regulate transcription
c.
insulin is a polypeptide hormone
d.
progesterone is a polypeptide hormone
e.
all of the above are true


ANS:    D    

    5.  The acrosome reaction, involving ion channel induced release of acrosomal enzymes used by sperm to attack the egg, is induced by:
a.
estrogen.
b.
testosterone.
c.
dihydrotestosterone (DHT).
d.
progesterone.
e.
cortisol.


ANS:    D    

I am using /(\d.)[^\d][^ANS$]+/gm, but when it parses the whole document it keeps skipping questions 2 and 3.

Any suggestions would be appreciated.

3 Answers:

Answer 0: (score: 1)

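The (\d.)[^\d][^ANS$]+ pattern matches a digit (\d) followed by any character other than a line break (.), then any character other than a digit ([^\d]), then one or more characters other than A, N, S, or $. ([^...] is a negated character class in which $ loses its special meaning; it matches single characters from the class, not a sequence.)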
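To make the failure concrete, here is a minimal sketch that runs the original pattern over a shortened, made-up version of the sample (the `sample` text is an assumption for illustration, not the real document). The second match stops right before the capital "S" in "Steroid", which is why question 2 appears to be skipped.

```javascript
// The pattern from the question, unchanged.
const original = /(\d.)[^\d][^ANS$]+/gm;

// Shortened, hypothetical stand-in for the docx text.
const sample = [
  "1.  Hormones bind to receptors",
  "a.",
  "low",
  "ANS:    E",
  "2.  Steroid hormones act by:",
  "a.",
  "binding",
  "ANS:    E",
].join("\n");

const matches = sample.match(original);

// [^ANS$]+ consumes everything, line breaks included, until it meets
// a single "A", "N", "S" or "$" character.
console.log(JSON.stringify(matches));
// ["1.  Hormones bind to receptors\na.\nlow\n","2.  "]
```

Question 1 survives only because its text happens to contain no capital A, N, or S before the ANS: line.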

To fix the regex, you can use

/^\s*(\d+\..*(?:\r?\n(?!\s*ANS:).*)*)\r?\n\s*(ANS:.*)/gm

See the regex demo.

Details

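  • ^ - start of a line (the m modifier makes ^ match at the start of each line rather than only at the start of the whole string)
  • \s* - 0 or more whitespace characters
  • ( - start of Group 1:
    • \d+ - one or more digits
    • \. - a dot
    • .* - the rest of the line
    • (?:\r?\n(?!\s*ANS:).*)* - 0 or more consecutive occurrences of:
      • \r?\n - a CRLF or LF line break...
      • (?!\s*ANS:) - ...that is not followed by 0+ whitespace characters and the ANS: substring ((?!...) is a negative lookahead that fails the match if its pattern matches immediately to the right of the current location)
      • .* - the rest of the line
  • ) - end of Group 1
  • \r?\n - a line break
  • \s* - 0+ whitespace characters
  • (ANS:.*) - Group 2, capturing ANS: and the rest of the line.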
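The negative lookahead is the part that decides where a question block ends, so it may help to see it in isolation. The sketch below applies the same (?!\s*ANS:) check to individual lines; the `notAnswerLine` name and the `lines` array are made up for illustration.

```javascript
// Same lookahead as in the full pattern, applied to one line at a time.
const notAnswerLine = /^(?!\s*ANS:).*$/;

// Hypothetical lines: two option lines, an empty line, and an answer line.
const lines = ["a.", "low; far above", "", "   ANS:    E    "];

const flags = lines.map((line) => notAnswerLine.test(line));

// Only the line that starts (after optional whitespace) with "ANS:" is rejected.
console.log(flags); // [ true, true, true, false ]
```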

JS demo:

var rx = /^\s*(\d+\..*(?:\r?\n(?!\s*ANS:).*)*)\r?\n\s*(ANS:.*)/gm;
var s = "1.  Hormones and other signal molecules bind with ____ affinities to their receptors and are produced at concentrations ____ their KD values.\r\na.\r\nlow; far above\r\nb.\r\nmoderate; far above\r\nc.\r\nmoderate; equivalent to\r\nd.\r\nhigh; far below\r\ne.\r\nvery high; equivalent to\r\n\r\n\r\nANS:    E    \r\n\r\n    2.  Steroid hormones, such as glucocorticoids, effect their action by:\r\na.\r\nbinding to a plasma membrane receptor, which stimulates a signal transduction pathway within the cell\r\nb.\r\nbinding to a plasma membrane receptor, which stimulates the receptor to enter the cell\r\nc.\r\nentering into the cell and affecting the production of secondary messengers\r\nd.\r\nentering into the cell and then acting as transcription regulators\r\ne.\r\nboth a and d are correct\r\n\r\n\r\nANS:    E    \r\n\r\n    3.  All are unifying features of polypeptide hormones EXCEPT that they are:\r\na.\r\noriginally synthesized with signal sequences.\r\nb.\r\nsynthesized as inactive preprohormones.\r\nc.\r\nactivated from preprohormones to hormones by phosphorylation.\r\nd.\r\nmay produce several different peptide hormones with suitable processing.\r\ne.\r\nall are true.\r\n\r\n\r\nANS:    C    \r\n\r\n    4.  Each of the following statements is true EXCEPT:\r\na.\r\nepinephrine is an amino acid derivative\r\nb.\r\nsteroid hormones can enter cells and regulate transcription\r\nc.\r\ninsulin is a polypeptide hormone\r\nd.\r\nprogesterone is a polypeptide hormone\r\ne.\r\nall of the above are true\r\n\r\n\r\nANS:    D    \r\n\r\n    5.  The acrosome reaction, involving ion channel induced release of acrosomal enzymes used by sperm to attack the egg, is induced by:\r\na.\r\nestrogen.\r\nb.\r\ntestosterone.\r\nc.\r\ndihydrotestosterone (DHT).\r\nd.\r\nprogesterone.\r\ne.\r\ncortisol.\r\n\r\n\r\nANS:    D  "
var m;
var qst=[], ans=[];
while (m = rx.exec(s)) {
  qst.push(m[1].trim());
  ans.push(m[2].trim());
}
document.body.innerHTML += "<pre>" + JSON.stringify(qst, 0, 4) + "</pre>";
document.body.innerHTML += "<pre>" + JSON.stringify(ans, 0, 4) + "</pre>";
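Outside a browser, the same pattern can be used to pair each question with its answer letter. This is only a sketch: `toCards` and the short `text` sample are hypothetical names and data, and it assumes an environment that supports String.prototype.matchAll.

```javascript
// Same pattern as in the demo above.
const rx = /^\s*(\d+\..*(?:\r?\n(?!\s*ANS:).*)*)\r?\n\s*(ANS:.*)/gm;

// Hypothetical helper: one object per question block.
function toCards(text) {
  return Array.from(text.matchAll(rx), (m) => ({
    question: m[1].trim(),
    // Drop the "ANS:" label and keep only the letter.
    answer: m[2].replace(/^ANS:\s*/, "").trim(),
  }));
}

// Shortened, made-up input in the same layout as the sample questions.
const text =
  "    1.  First question?\na.\nyes\nb.\nno\n\n\nANS:    A    \n\n" +
  "    2.  Second question?\na.\nleft\nb.\nright\n\n\nANS:    B    \n";

const cards = toCards(text);
console.log(cards.map((c) => c.answer)); // [ 'A', 'B' ]
```

Keeping the question text and the answer in one object avoids having to keep two parallel arrays in sync.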

Answer 1: (score: 1)

If you only want to select the questions, you can simply use:

node['java']['webapps'].each do |name, app_attrs|
  conf_file = app_attrs.read('enabled', 'x', 'conf')
  if conf_file
    template .. same stuff here
  end
end

If you want to select each question together with its answers, you can use:

/\d.+/gm

This assumes that the questions and answers contain no digits.
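The digit assumption matters because /\d.+/gm starts a match at any digit, not only at a question number. A small sketch with a made-up option line ("5 mg" is hypothetical, it does not appear in the sample) shows the extra match:

```javascript
// A digit followed by the rest of the line.
const digitLine = /\d.+/gm;

// Hypothetical question whose first option starts with a digit.
const text = "1.  Which dose is typical?\na.\n5 mg\nb.\nnone";

const found = text.match(digitLine);

// The option line is picked up as if it were a question.
console.log(found); // [ '1.  Which dose is typical?', '5 mg' ]
```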

Answer 2: (score: 0)

You could also try a simpler solution:

fs.readdir

Apart from (\d\.)[\s\S]*?(?=ANS), it will not break.

Demo: https://regex101.com/r/Qnwl8b/2/
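As a rough check of the lazy pattern (\d\.)[\s\S]*?(?=ANS), the sketch below runs it with the g flag over a shortened, made-up sample. The `text` and `tenth` strings are assumptions for illustration. It also shows one limitation worth knowing: \d\. matches a single digit, so a two-digit question number loses its first digit.

```javascript
// Lazy match from "digit + dot" up to, but not including, the next "ANS".
const lazy = /(\d\.)[\s\S]*?(?=ANS)/g;

// Hypothetical two-question input.
const text =
  "1.  First question?\na.\nyes\n\nANS:    A\n\n" +
  "2.  Second question?\na.\nno\n\nANS:    B\n";

const blocks = Array.from(text.matchAll(lazy), (m) => m[0].trim());
console.log(blocks.length); // 2

// With a two-digit number the match starts at the "0", not the "1".
const tenth = "10.  Tenth question?\na.\nyes\n\nANS:    A";
const firstGroup = tenth.match(/(\d\.)[\s\S]*?(?=ANS)/)[1];
console.log(firstGroup); // 0.
```

Changing \d to \d+ is one way to keep multi-digit numbers intact, at the cost of drifting slightly from the pattern as posted.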