Question

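Let's say we have this text:

The 85 kilos guy ran 10 miles and then we can see he is simply dummy text of the printing and typesetting industry, and all of this in 2 hours

We want to capture:

85 kilos
10 miles
2 hours

I'm trying to write a function that can retrieve some attributes (known attributes, of course).

Let's say we want to detect:

Attribute: [amount] [measure]

Our measures are:

[miles, seconds, hours, minutes, times, kilos]

So my idea is to split the text on whitespace, check whether each word is in the array of measures, and if the previous word is a number, then I have an attribute :D

(This is a kind of pseudo-JavaScript code)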

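// Known measures to look for
var mesures = ['miles', 'seconds', 'hours', 'minutes', 'times', 'kilos'];

function get_mesure_attrs(txt) {
    var text = txt.split(' ');
    // Start at i = 1 because the first word can never be the measure of a desired attribute
    for (var i = 1; i < text.length; i++) {
        // Is the current word one of the known measures?
        if (mesures.indexOf(text[i]) !== -1) {
            // Is the previous word made up of digits only?
            if (/^\d+$/.test(text[i - 1])) {
                console.log('Attribute: ' + text[i - 1] + ' ' + text[i]);
            }
        }
    }
}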

I'm not familiar enough with associative arrays, so I'd like to know if anyone can give me a hint.

Thanks a lot

Answer 1

I'd suggest using a regular expression:

function getMeasureAttrs(txt) {
  var re = /(\d+)\s+(miles|seconds|hours|minutes|times|kilos)/g;
  var match;
  while (match = re.exec(txt)) {
    console.log('Attribute: ' + match[1] + ' ' + match[2]);
  }
}

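The two parenthesized parts of the regular expression are capture groups. The first, (\d+), matches an integer (one or more digits); the second matches one of the units you listed. After each successful exec call, match[1] holds the number and match[2] holds the unit.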
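To make the capture groups more concrete, here is a small sketch that collects what each exec call returns instead of logging it. The helper name collectMatches and the shape of the returned objects are made up for illustration; they are not part of the answer above.

```javascript
// Hypothetical helper: records what each capture group holds for every match.
function collectMatches(txt) {
  var re = /(\d+)\s+(miles|seconds|hours|minutes|times|kilos)/g;
  var results = [];
  var match;
  // exec returns null once there are no more matches
  while ((match = re.exec(txt)) !== null) {
    results.push({
      full: match[0],           // the whole matched text
      amount: Number(match[1]), // first group, converted to a number
      measure: match[2],        // second group, the unit
      index: match.index        // position of the match inside txt
    });
  }
  return results;
}

var sample = 'The 85 kilos guy ran 10 miles, and all of this in 2 hours';
console.log(collectMatches(sample));
```

Returning the matches rather than printing them makes the function easier to reuse, for example to sum amounts per unit.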

Answer 2

var str= "The 85 kilos guy rant 10 miles and then we can se he is simply dummy text of the printing and typesetting industry and all of this in 2 hours 1 kilo",
measures = "mile|second|hour|minute|time|kilo";
function getMeasureAttrs(txt) {
  var re = RegExp( "\\b(\\d+)\\s(("+ measures +")s?)","g" );
  var attrs = [];
  txt.replace( re, function  ( $, $1, $2 ) {
    attrs.push ([ $1, $2 ] );
  })
  return attrs;
}
console.log(  getMeasureAttrs( str ) ); // [["85", "kilos"], ["10", "miles"], ["2", "hours"],["1","kilo"]]
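As a possible variation on the code above, the same pairs can be collected with String.prototype.matchAll (ES2020) instead of calling replace only for its callback. This is just a sketch: the names measureList, escapeRegExp and getMeasureAttrsWithMatchAll are assumptions for illustration, and the trailing \b is an addition so that something like "10 kilometers" is not reported as "kilo".

```javascript
// Assumed list of singular unit names; adjust it to your own data.
const measureList = ['mile', 'second', 'hour', 'minute', 'time', 'kilo'];

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function getMeasureAttrsWithMatchAll(txt) {
  const units = measureList.map(escapeRegExp).join('|');
  // A whole number, whitespace, then a unit with an optional plural "s",
  // ending on a word boundary.
  const re = new RegExp('\\b(\\d+)\\s+((?:' + units + ')s?)\\b', 'g');
  // matchAll requires the "g" flag and yields one match object per hit.
  return Array.from(txt.matchAll(re), (m) => [m[1], m[2]]);
}

console.log(getMeasureAttrsWithMatchAll('all of this in 2 hours 1 kilo'));
```

Keeping the units in an array rather than a pre-joined string makes it easier to load them from a file and escape each one before building the pattern.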


Retrieving attributes from plain text
