Using a regular expression to get values from multi-line text

Date: 2015-10-15 19:06:05

Tags: javascript regex localization

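I have the following situation:

I want to use a regex to extract all of the strings that follow .html( in the JS code below.

This is what I have tried so far: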

(\.html\()?"(.+)"\s*\+*
\.html\("(.+)"(\s*\+\s*\n\s*"(.+)")*

But neither pattern works for all of the lines.

Any help is greatly appreciated.

Thanks.

JavaScript code

     sym.getSymbol("popup").$("main").html("Delamination at Sharp Points and Corners");
     sym.getSymbol("popup").$("sub").html("<span style='font-family: abel_probold'>Defect:</span> Bond separates easily at the tip of a sharp edge or corner.<br><br>" +
         "<span style='font-family: abel_probold'>Common cause:</span> Very little adhesive area to hold the application in place<br><br>" +
         "<span style='font-family: abel_probold'>Corrective action:</span> When possible, eliminate sharp points. Validate bond performance with wash testing.</div>");

3 answers:

Answer 0 (score: 2)

You can match everything between a pair of double quotes that is not itself a double quote:

string.match(/\"([^\"]*)\"/g)
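As a minimal sketch of how this pattern behaves, the snippet below runs it against the first line of the question's code. The variable names (source, quoted, values) are illustrative and not part of the answer. With the g flag, string.match returns the matches with their quotes still attached, so the sketch uses matchAll to read capture group 1 instead.

```javascript
// First line of the JavaScript from the question.
const source =
  'sym.getSymbol("popup").$("main").html("Delamination at Sharp Points and Corners");';

// Same pattern as above (the backslashes before the quotes are optional).
const quoted = /"([^"]*)"/g;

// matchAll yields one match object per hit; index 1 is the text without quotes.
const values = Array.from(source.matchAll(quoted), (m) => m[1]);

console.log(values);
// [ 'popup', 'main', 'Delamination at Sharp Points and Corners' ]
```

Note that this also picks up "popup" and "main", because the pattern matches every double-quoted string and not only the ones passed to .html(.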

Answer 1 (score: 1)

You can use this regular expression:

string.match(/\.html\((.+?)\)/gs)

https://regex101.com/r/xW5yR3/1
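The captured group here is the whole argument of the call, including the quotes and any + between concatenated pieces. One possible way to turn that into plain values is sketched below: a second pass pulls the double-quoted pieces out of each argument and joins them. The sample input is a shortened, hypothetical version of the question's code, and the names callPattern and htmlValues are illustrative.

```javascript
// Shortened, hypothetical stand-in for the code in the question.
const source = [
  'sym.getSymbol("popup").$("main").html("Title text");',
  'sym.getSymbol("popup").$("sub").html("<b>Defect:</b> first part<br>" +',
  '    "<b>Cause:</b> second part");',
].join("\n");

// Step 1: capture the raw argument of each .html( ... ) call.
// The s (dotAll) flag lets "." cross line breaks; it needs an ES2018+ engine.
const callPattern = /\.html\((.+?)\)/gs;

// Step 2: inside each argument, collect the double-quoted pieces and join them.
const htmlValues = Array.from(source.matchAll(callPattern), (call) =>
  Array.from(call[1].matchAll(/"([^"]*)"/g), (piece) => piece[1]).join("")
);

console.log(htmlValues);
// [ 'Title text', '<b>Defect:</b> first part<br><b>Cause:</b> second part' ]
```

Because the lazy (.+?) stops at the first closing parenthesis, a ")" inside one of the strings would cut the match short. The sample in the question contains none, so the approach is adequate there.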

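Answer 2 (score: -1)

If you just want to parse JS, you can use a modified C/C++ comment parser. Because it matches all of the text in the string, you just sit in a loop and check whether capture group 1 matched.

If group 1 matched, you have found a .html("..." call. Group 1 contains the quotes plus the text; group 2 is just the inner text.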

Formatted and tested:

    # (?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\\n?)*?\n)|(?:\.html\(("((?:\\[\S\s]|[^"\\])*)")|"(?:\\[\S\s]|[^"\\])*"|[\S\s](?:(?!\.html\()[^/"\\])*)

    (?:                              # Comments 
         /\*                              # Start /* .. */ comment
         [^*]* \*+
         (?: [^/*] [^*]* \*+ )*
         /                                # End /* .. */ comment
      |                                 # or,
         //                               # Start // comment
         (?: [^\\] | \\ \n? )*?           # Possible line-continuation
         \n                               # End // comment
    )
 |                                 # or,
    (?:                              # Non-Comments
         \.html\(
         (                                # (1 start), Html double quoted strings
              "
              (                                # (2 start), Inner text
                   (?: \\ [\S\s] | [^"\\] )*
              )                                # (2 end)
              "
         )                                # (1 end)
      |                                 # or,
         "                                # Other double quoted strings
         (?: \\ [\S\s] | [^"\\] )*
         "
      |                                 # or,
         [\S\s]                           # Any other char
         (?:
              (?! \.html\( )                   # Give html strings a chance above
              [^/"\\]                          # Chars that don't start a comment, string, escape,
                                               # or line continuation (escape + newline)
         )*
    )

Examples of what it finds:

 **  Grp 0 -  ( pos 36 , len 48 ) 
.html("Delamination at Sharp Points and Corners"  
 **  Grp 1 -  ( pos 42 , len 42 ) 
"Delamination at Sharp Points and Corners"  
 **  Grp 2 -  ( pos 43 , len 40 ) 
Delamination at Sharp Points and Corners  


 **  Grp 0 -  ( pos 124 , len 130 ) 
.html("<span style='font-family: abel_probold'>Defect:</span> Bond separates easily at the tip of a sharp edge or corner.<br><br>"  
 **  Grp 1 -  ( pos 130 , len 124 ) 
"<span style='font-family: abel_probold'>Defect:</span> Bond separates easily at the tip of a sharp edge or corner.<br><br>"  
 **  Grp 2 -  ( pos 131 , len 122 ) 
<span style='font-family: abel_probold'>Defect:</span> Bond separates easily at the tip of a sharp edge or corner.<br><br>
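
The loop described in this answer could be sketched in JavaScript roughly as follows. It uses the single-line form of the pattern, built with new RegExp and String.raw so the slashes and backslashes need no extra escaping. The sample input and the names pattern, found and m are hypothetical and only serve the illustration.

```javascript
// Single-line form of the pattern above, split in two for readability.
const pattern = new RegExp(
  String.raw`(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\\n?)*?\n)` +
    String.raw`|(?:\.html\(("((?:\\[\S\s]|[^"\\])*)")|"(?:\\[\S\s]|[^"\\])*"|[\S\s](?:(?!\.html\()[^/"\\])*)`,
  "g"
);

// Hypothetical input: two real calls, plus calls hidden inside comments.
const source = [
  'sym.getSymbol("popup").$("main").html("Title text"); // .html("in a comment")',
  '/* .html("also in a comment") */',
  'sym.getSymbol("popup").$("sub").html("Body \\"quoted\\" text");',
  "",
].join("\n");

// Every match consumes at least one character, so the loop always advances.
const found = [];
let m;
while ((m = pattern.exec(source)) !== null) {
  // Group 1 is only set when the match is a .html("...") string.
  if (m[1] !== undefined) {
    found.push(m[2]); // group 2: the text without the surrounding quotes
  }
}

for (const value of found) console.log(value);
// Title text
// Body \"quoted\" text
```

Group 2 is the raw source text, so escape sequences such as \" are returned unprocessed. As the sample output above suggests, only the first string literal after .html( sets group 1; the pieces concatenated with + are consumed by the "other double quoted strings" branch and would need separate handling.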