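Cancel a regex match if it times out

Time: 2016-08-09 20:05:06

Tags: javascript regex node.js

Is it possible to cancel a regex.match operation if it takes more than 10 seconds to complete?

I'm using a huge regex to match specific text. Sometimes it works and sometimes it fails...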

Regex: MINISTÉRIO(?:[^P]*(?:P(?!ÁG\s:\s\d+\/\d+)[^P]*)(?:[\s\S]*?))PÁG\s:\s+\d+\/(\d+)\b(?:\D*(?:(?!\1\/\1)\d\D*)*)\1\/\1(?:[^Z]*(?:Z(?!6:\s\d+)[^Z]*)(?:[\s\S]*?))Z6:\s+\d+

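Working example: https://regex101.com/r/kU6rS5/1

So I'd like to cancel the operation if it takes more than 10 seconds. Is that possible? I couldn't find anything related on SO.

Thanks.

2 answers: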

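Answer 0: (score: 1)

You can spawn a child process that performs the regex match and kill it if it hasn't finished within 10 seconds. It may be overkill, but it should work.

fork is probably what you should use if you go this route.

If you'll forgive my non-pure functions, this code shows the gist of how to communicate back and forth between the forked child process and the main process: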

index.js

const { fork } = require('child_process');
const processPath = __dirname + '/regex-process.js';
const regexProcess = fork(processPath);
let received = null;

regexProcess.on('message', function(data) {
  console.log('received message from child:', data);
  clearTimeout(timeout);
  received = data;
  regexProcess.kill(); // or however you want to end it. just as an example.
  // you have access to the regex data here.
  // send to a callback, or resolve a promise with the value,
  // so the original calling code can access it as well.
});

const timeoutInMs = 10000;
let timeout = setTimeout(() => {
  if (!received) {
    console.error('regexProcess is still running!');
    regexProcess.kill(); // or however you want to shut it down.
  }
}, timeoutInMs);

regexProcess.send('message to match against');

regex-process.js

function respond(data) {
  process.send(data);
}

function handleMessage(data) {
  console.log('handling message:', data);
  // run your regex calculations in here
  // then respond with the data when it's done.

  // the following is just to emulate
  // a synchronous computational delay
  for (let i = 0; i < 500000000; i++) {
    // spin!
  }
  respond('return regex process data in here');
}

process.on('message', handleMessage);

But this may end up masking the real problem. You may want to consider rewriting the regex, as other posters have suggested.
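If the calling code can afford to block, the same "run it in a child process and kill it on timeout" idea can be sketched with spawnSync and its timeout option instead of fork. This is a variation on the approach above, not the code from this answer. The names matchWithTimeout and childScript are hypothetical, and the sketch assumes the regex and text can be passed to the child as JSON over stdin.

```javascript
const { spawnSync } = require('child_process');

// Script run by the child (hypothetical helper): reads JSON
// { source, flags, text } from stdin, runs the match, and prints
// the match array (or null) as JSON on stdout.
const childScript = `
  let input = '';
  process.stdin.on('data', (chunk) => { input += chunk; });
  process.stdin.on('end', () => {
    const { source, flags, text } = JSON.parse(input);
    const match = new RegExp(source, flags).exec(text);
    process.stdout.write(JSON.stringify(match ? Array.from(match) : null));
  });
`;

// Runs regex.exec(text) in a separate Node process and blocks until
// it finishes or timeoutMs elapses, at which point the child is killed.
function matchWithTimeout(regex, text, timeoutMs) {
  const result = spawnSync(process.execPath, ['-e', childScript], {
    input: JSON.stringify({ source: regex.source, flags: regex.flags, text }),
    encoding: 'utf8',
    timeout: timeoutMs,
  });

  if (result.error) {
    if (result.error.code === 'ETIMEDOUT') {
      return { timedOut: true, match: null };
    }
    throw result.error; // the child could not be spawned at all
  }
  if (result.status !== 0) {
    throw new Error('regex child failed: ' + result.stderr);
  }
  return { timedOut: false, match: JSON.parse(result.stdout) };
}

// Example: a pattern known to backtrack badly, with a short timeout.
const outcome = matchWithTimeout(/^(A+)*B/, 'A'.repeat(49) + 'C', 500);
console.log(outcome.timedOut ? 'gave up after timeout' : outcome.match);
```

The trade-off is that spawnSync blocks the event loop for up to timeoutMs, so this form only suits scripts or batch jobs. A server would more likely keep the asynchronous fork version.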

Answer 1: (score: 1)

Another solution I found here: https://www.josephkirwin.com/2016/03/12/nodejs_redos_mitigation/

It is based on the vm module, with no process fork. It looks pretty good.

    const util = require('util');
    const vm = require('vm');

    var sandbox = {
        regex:/^(A+)*B/,
        string:"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC",
        result: null
    };

    var context = vm.createContext(sandbox);
    console.log('Sandbox initialized: ' + vm.isContext(sandbox));
    var script = new vm.Script('result = regex.test(string);');
    try{
        // One could argue that if a RegExp hasn't finished within a given time,
        // it's likely to take exponential time.
        script.runInContext(context, { timeout: 1000 }); // milliseconds
    } catch(e){
        console.log('ReDos occurred',e); // Take some remedial action here...
    }

    console.log(util.inspect(sandbox)); // Check the results
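
To reuse this from calling code, the snippet above could be wrapped in a small helper that reports whether the timeout fired. This is only a sketch: the name testWithTimeout is hypothetical, and it assumes a Node version where the timeout error carries the code ERR_SCRIPT_EXECUTION_TIMEOUT.

```javascript
const vm = require('vm');

// Runs regex.test(text) inside a vm context and gives up after timeoutMs.
// Returns { timedOut, result }, where result is the boolean from test().
function testWithTimeout(regex, text, timeoutMs) {
  const sandbox = { regex, text, result: null };
  const context = vm.createContext(sandbox);
  const script = new vm.Script('result = regex.test(text);');

  try {
    script.runInContext(context, { timeout: timeoutMs }); // milliseconds
    return { timedOut: false, result: sandbox.result };
  } catch (err) {
    // Assumption: the timeout surfaces with this error code.
    if (err && err.code === 'ERR_SCRIPT_EXECUTION_TIMEOUT') {
      return { timedOut: true, result: null };
    }
    throw err; // anything else is a genuine failure
  }
}

// Example: the same pattern and input as above, with a 1 second limit.
const check = testWithTimeout(/^(A+)*B/, 'A'.repeat(49) + 'C', 1000);
console.log(check.timedOut ? 'possible ReDoS, match abandoned' : check.result);
```

Unlike the child-process approach, this runs on the main thread, so the event loop is still blocked until the match finishes or the timeout fires.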