How do I extract a string using a regular expression?

Time: 2016-11-22 07:53:26

Tags: javascript regex

Suppose I have the following text:

for (;;);{"__ar":1,"__sf":"k","payload":null,"domops":[["appendContent","^div.fbProfileBrowserListContainer",true,{"__html":"\u003Cdiv class=\"fbProfileBrowserList expandedList\" id=\"100008123852509\">\u003Cul class=\"uiList clearfix _5bbv _4kg _704 _4ks\">\u003Cli class=\"fbProfileBrowserListItem\">\u003Cdiv class=\"clearfix _5qo4\">\u003Ca class=\"_8o _8t lfloat _ohe\" href=\"https:\/\/www.facebook.com\/tasvirmanepal\/?fref=pb\" tabindex=\"-1\" aria-hidden=\"true\">\u003Cimg class=\"_s0 _rw img\" src=\"https:\/\/fb-s-b-a.akamaihd.net\/h-ak-xta1\/v\/t1.0-1\/c13.0.50.50\/p50x50\/12246747_952307471472706_3389977056619055535_n.jpg?oh=fdcd99bd098ad7d60b67701358bdbc97&oe=58D45D87&__gda__=1489984135_7befa40c475cf7f2a6aa021e97d7f429\" alt=\"\" \/>\u003C\/a>\u003Cdiv class=\"clearfix _42ef\">\u003Cdiv class=\"_6a rfloat _ohf\">\u003Cdiv class=\"_6a _6b\" style=\"height:50px\">\u003C\/div>\u003Cdiv class=\"_6a _6b\">\u003Cdiv class=\"_5t4x\">\u003Cspan

Now I want to extract a string like this:

src=\"https:\/\/fb-s-b-a.akamaihd.net\/h-ak-xta1\/v\/t1.0-1\/c13.0.50.50\/p50x50\/12246747_952307471472706_3389977056619055535_n.jpg?oh=fdcd99bd098ad7d60b67701358bdbc97&oe=58D45D87&__gda__=1489984135_7befa40c475cf7f2a6aa021e97d7f429\"

2 answers:

Answer 0 (score: 0)

A regular expression encodes matching logic, so define the logic first and then write the regular expression.

For example:

  • Does the match sit between double quotes, without containing a quote itself?
  • Is the match preceded by src=\"

On sites such as http://regexr.com/ you can easily test a regular expression.

Here is one possible regular expression: src=\\\"[^"]*" but it will not help you with edge cases.

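It basically matches anything that starts with src=\", then accepts any characters (zero or more) other than a double quote ", and finally includes the closing double quote.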
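As a minimal sketch of that pattern in JavaScript, assuming the text is already held in a string: the names text, pattern, and srcAttribute and the shortened example.com URL are made up for illustration and are not from the question.

```javascript
// Shortened, hypothetical sample. String.raw keeps every backslash literal,
// so the string looks like the escaped HTML in the question.
const text = String.raw`\u003Cimg class=\"_s0 _rw img\" src=\"https:\/\/example.com\/a.jpg?oh=1&oe=2\" alt=\"\" \/>`;

// src=   the literal attribute name and equals sign
// \\     one literal backslash
// \"     the opening double quote
// [^"]*  zero or more characters that are not a double quote
// "      the closing double quote
const pattern = /src=\\\"[^"]*"/;

const found = text.match(pattern);
const srcAttribute = found ? found[0] : null;

console.log(srcAttribute);
// src=\"https:\/\/example.com\/a.jpg?oh=1&oe=2\"
```

Note that the backslash before the closing quote is part of the match, because [^"] accepts any character that is not a double quote, backslashes included.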

Answer 1 (score: 0)

Please try this:



const regex = /src=\\"[^"]*?\\"/g;
const str = `for (;;);{"__ar":1,"__sf":"k","payload":null,"domops":[["appendContent","^div.fbProfileBrowserListContainer",true,{"__html":"\\u003Cdiv class=\\"fbProfileBrowserList expandedList\\" id=\\"100008123852509\\">\\u003Cul class=\\"uiList clearfix _5bbv _4kg _704 _4ks\\">\\u003Cli class=\\"fbProfileBrowserListItem\\">\\u003Cdiv class=\\"clearfix _5qo4\\">\\u003Ca class=\\"_8o _8t lfloat _ohe\\" href=\\"https:\\/\\/www.facebook.com\\/tasvirmanepal\\/?fref=pb\\" tabindex=\\"-1\\" aria-hidden=\\"true\\">\\u003Cimg class=\\"_s0 _rw img\\" src=\\"https:\\/\\/fb-s-b-a.akamaihd.net\\/h-ak-xta1\\/v\\/t1.0-1\\/c13.0.50.50\\/p50x50\\/12246747_952307471472706_3389977056619055535_n.jpg?oh=fdcd99bd098ad7d60b67701358bdbc97&oe=58D45D87&__gda__=1489984135_7befa40c475cf7f2a6aa021e97d7f429\\" alt=\\"\\" \\/>\\u003C\\/a>\\u003Cdiv class=\\"clearfix _42ef\\">\\u003Cdiv class=\\"_6a rfloat _ohf\\">\\u003Cdiv class=\\"_6a _6b\\" style=\\"height:50px\\">\\u003C\\/div>\\u003Cdiv class=\\"_6a _6b\\">\\u003Cdiv class=\\"_5t4x\\">\\u003Cspan`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}