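I'm looking for the most efficient way to find and return specific pieces of text from a large string in JS.
The rule is that the text starts with "ID_" and ends with ".pdf".
Suppose I have a string like this (a shortened version of it):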
<ul>
<li><a href="/questions/237104/ID_2556.pdf">Click here to
download.</a></li>
<li><a href="/questions/237104/ID_37.pdf">Click
here to download.</a></li>
<li><a
href="/questions/237104/ID_29997.pdf">Click here to download.</a></li>
<li><a href="/questions/237104/ID_0554.pdf">Click here to
download.</a></li>
</ul>
The script should return these individual values as strings:
ID_2556.pdf
ID_37.pdf
ID_29997.pdf
ID_0554.pdf
Answer 0 (score: 1)
You can use String.prototype.match to get all the matching strings:
var html = `
<ul>
<li><a href="/questions/237104/ID_2556.pdf">Click here to
download.</a></li>
<li><a href="/questions/237104/ID_37.pdf">Click
here to download.</a></li>
<li><a
href="/questions/237104/ID_29997.pdf">Click here to download.</a></li>
<li><a href="/questions/237104/ID_0554.pdf">Click here to
download.</a></li>
</ul>
`;
// The escaped dot (\.) makes the match end in a literal ".pdf", per the stated rule.
console.log(html.match(/ID_.*?\.pdf/g));
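One thing to watch with this approach: when the g flag is set and nothing matches, match() returns null rather than an empty array. A minimal sketch of a wrapper that handles this is below. The function name extractPdfNames is hypothetical, and the \d+ in the pattern assumes the IDs are digits only, as in the sample data.

```javascript
// Hypothetical helper (not from the answer): collect every "ID_<digits>.pdf" name.
// Assumption: IDs consist of digits only, as in the question's sample.
function extractPdfNames(text) {
  // \d+ restricts the ID to digits; \. matches a literal dot.
  const matches = text.match(/ID_\d+\.pdf/g);
  // With the g flag, match() returns null (not []) when nothing matches.
  return matches === null ? [] : matches;
}

const sample =
  '<a href="/questions/237104/ID_2556.pdf">x</a> ' +
  '<a href="/questions/237104/ID_37.pdf">y</a>';

console.log(extractPdfNames(sample));          // [ 'ID_2556.pdf', 'ID_37.pdf' ]
console.log(extractPdfNames('no links here')); // []
```

Callers can then loop over the result without a null check.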
Answer 1 (score: 0)
You may want to use a regular expression for this task:
/ID_.*?\.pdf/gm
Here is a playground: https://regex101.com/r/mD5Yt3/1
It will generate the code for you.
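The backslash before the dot in this pattern matters. The sketch below compares it with a variant that has no dot before pdf, using a made-up input chosen to expose the difference (the input is not from the question). Separately, the m flag only changes how ^ and $ behave, so it has no effect on a pattern without those anchors.

```javascript
// Made-up input chosen to expose the difference; it is not from the question.
const tricky = 'see ID_pdfnotes.pdf';

// Without the dot, the lazy match stops at the first "pdf" it can reach.
const loose = tricky.match(/ID_.*?pdf/g);

// With \. the match must end in a literal ".pdf".
const strict = tricky.match(/ID_.*?\.pdf/g);

console.log(loose);  // [ 'ID_pdf' ]
console.log(strict); // [ 'ID_pdfnotes.pdf' ]
```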
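Answer 2 (score: 0)
One option is to use DOMParser to convert the HTML string into a document, then select the a elements whose href ends with .pdf, pick out the ones that match the required format, and push them to an array: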
const htmlStr = `<ul>
<li><a href="/questions/237104/ID_2556.pdf">Click here to
download.</a></li>
<li><a href="/questions/237104/ID_37.pdf">Click
here to download.</a></li>
<li><a
href="/questions/237104/ID_29997.pdf">Click here to download.</a></li>
<li><a href="/questions/237104/ID_0554.pdf">Click here to
download.</a></li>
</ul>`;
const doc = new DOMParser().parseFromString(htmlStr, 'text/html');
const filenames = [...doc.querySelectorAll('a[href$=".pdf"]')]
  .reduce((filenames, { href }) => {
    const match = href.match(/ID_\d+\.pdf/);
    if (match) filenames.push(match[0]);
    return filenames;
  }, []);
console.log(filenames);
If you want less code, you can also do all the filtering inside reduce rather than in the selector string, although this may be less efficient:
const filenames = [...doc.querySelectorAll('a')]
...
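The elided body above is left as the answer gave it. As a separate, hedged sketch of the same idea, the helper below does all the filtering inside reduce. The name collectPdfNames and the sample objects are hypothetical. It accepts any iterable of objects with an href string, so in a browser it could be passed doc.querySelectorAll('a'). The $ anchor in the pattern is an assumption that stands in for the href$=".pdf" selector.

```javascript
// Hypothetical helper: keep only hrefs that end in an "ID_<digits>.pdf" name.
// `links` may be any iterable of objects with an `href` string,
// e.g. the NodeList returned by doc.querySelectorAll('a') in a browser.
function collectPdfNames(links) {
  return [...links].reduce((names, { href }) => {
    // The trailing $ replaces the a[href$=".pdf"] selector check (assumption).
    const match = href.match(/ID_\d+\.pdf$/);
    if (match) names.push(match[0]);
    return names;
  }, []);
}

// Plain objects standing in for anchor elements (made-up data).
const fakeLinks = [
  { href: 'https://example.com/questions/237104/ID_2556.pdf' },
  { href: 'https://example.com/about.html' },
  { href: 'https://example.com/ID_37.pdf?download=1' },
];

console.log(collectPdfNames(fakeLinks)); // [ 'ID_2556.pdf' ]
```

The third entry is skipped because its href does not end in .pdf, which mirrors what the attribute selector would have excluded.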