Most efficient way to find specific text in a string in JS?

Time: 2018-11-17 23:27:00

Tags: javascript

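I am looking for the most efficient way to find and return specific text from a large string in JS.

The rule for the specific text is that it starts with "ID_" and ends with ".pdf".

Suppose I have a string like this (a shortened version):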

<ul>
<li><a href="/questions/237104/ID_2556.pdf">Click here to
download.</a></li>
<li><a href="/questions/237104/ID_37.pdf">Click
here to download.</a></li>
<li><a
href="/questions/237104/ID_29997.pdf">Click here to download.</a></li>
<li><a href="/questions/237104/ID_0554.pdf">Click here to
download.</a></li>
</ul>

The script should return these individual values as strings:

ID_2556.pdf
ID_37.pdf
ID_29997.pdf
ID_0554.pdf

3 answers:

Answer 0 (score: 1)

You can use String.prototype.match to get all matching strings:

var html = `
<ul>
<li><a href="/questions/237104/ID_2556.pdf">Click here to
download.</a></li>
<li><a href="/questions/237104/ID_37.pdf">Click
here to download.</a></li>
<li><a
href="/questions/237104/ID_29997.pdf">Click here to download.</a></li>
<li><a href="/questions/237104/ID_0554.pdf">Click here to
download.</a></li>
</ul>
`;

console.log(html.match(/ID_.*?\.pdf/g));
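
Two caveats with this approach: with the g flag, match returns null rather than an empty array when nothing matches, and the lazy .*? accepts any characters after "ID_". A minimal sketch of a stricter variant follows, assuming the IDs consist of digits only, as in the sample. The names findPdfNames and sample are hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: returns every "ID_<digits>.pdf" substring found in text.
// Assumption: IDs consist of digits only, as in the sample markup.
function findPdfNames(text) {
  // \d+ restricts the ID to digits; \. matches a literal dot.
  // match() with the g flag returns null when nothing matches, so fall back to [].
  return text.match(/ID_\d+\.pdf/g) || [];
}

// Hypothetical input, shortened from the question's markup.
const sample =
  '<a href="/questions/237104/ID_2556.pdf">x</a> ' +
  '<a href="/questions/237104/ID_37.pdf">y</a>';

console.log(findPdfNames(sample));          // [ 'ID_2556.pdf', 'ID_37.pdf' ]
console.log(findPdfNames('no links here')); // []
```

Falling back to an empty array lets callers iterate over the result without a null check.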

Answer 1 (score: 0)

You may want to use a regular expression for this task.

Here is a playground: https://regex101.com/r/mD5Yt3/1

It will generate the code for you:

/ID_.*?\.pdf/gm
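
To use this pattern in a script rather than in the playground, something like the following may work. It is only a sketch: markup and hits are hypothetical names, and String.prototype.matchAll requires an ES2020-capable runtime.

```javascript
// The pattern from the answer above. The m flag only changes how ^ and $ behave,
// so it has no effect on this pattern and is omitted here.
const pattern = /ID_.*?\.pdf/g;

// Hypothetical input, shortened from the question's markup.
const markup =
  '<li><a href="/questions/237104/ID_29997.pdf">Click here</a></li>\n' +
  '<li><a href="/questions/237104/ID_0554.pdf">Click here</a></li>';

// matchAll yields one match object per hit, including its index in the string.
const hits = Array.from(markup.matchAll(pattern), (m) => ({
  name: m[0],
  index: m.index,
}));

console.log(hits.map((h) => h.name)); // [ 'ID_29997.pdf', 'ID_0554.pdf' ]
```

The index field is only needed if the position of each match matters; for the file names alone, match with the g flag is enough.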

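Answer 2 (score: 0)

One option is to use DOMParser to turn the HTML string into a document, select the a elements whose href ends in ".pdf", find the ones that match the required format, and push them to an array: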

const htmlStr = `<ul>
<li><a href="/questions/237104/ID_2556.pdf">Click here to
download.</a></li>
<li><a href="/questions/237104/ID_37.pdf">Click
here to download.</a></li>
<li><a
href="/questions/237104/ID_29997.pdf">Click here to download.</a></li>
<li><a href="/questions/237104/ID_0554.pdf">Click here to
download.</a></li>
</ul>`;

const doc = new DOMParser().parseFromString(htmlStr, 'text/html');
const filenames = [...doc.querySelectorAll('a[href$=".pdf"]')]
  .reduce((filenames, { href }) => {
    const match = href.match(/ID_\d+\.pdf/);
    if (match) filenames.push(match[0]);
    return filenames;
  }, []);
console.log(filenames);
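
The reduce callback above only needs each href as a string, so the filtering step can be tried on plain strings in an environment where DOMParser is not available. A minimal sketch, where hrefs and picked are hypothetical names and the values are made up for illustration:

```javascript
// Hypothetical href values, standing in for what the anchors would provide.
const hrefs = [
  '/questions/237104/ID_2556.pdf',
  '/questions/237104/notes.pdf', // ends in .pdf but has no ID_ prefix
  '/questions/237104/ID_37.pdf',
];

// Same accumulation pattern as the answer: push the matched file name, skip non-matches.
const picked = hrefs.reduce((acc, href) => {
  const match = href.match(/ID_\d+\.pdf/);
  if (match) acc.push(match[0]);
  return acc;
}, []);

console.log(picked); // [ 'ID_2556.pdf', 'ID_37.pdf' ]
```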

If you want less code, you can also do all the filtering inside the reduce rather than in the selector string, though that might be less efficient:

const filenames = [...doc.querySelectorAll('a')]
  ...