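I can't get my head around this. I need to parse the following string with a regular expression to produce the definition list below.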
Width=3/8 in|Length=1 in|Thread - TPI or Pitch=|Bolt/Screw Length=|Material=|Coating=|Type=Snap-On|Used With=|Quantity=5000 per pack|Wt.=20 lb|Color=
The result would look like this:
<dt>Width</dt>
<dd>3/8 in</dd>
<dt>Length </dt>
<dd>1 Inch</dd>
<dt>Thread - TPI or Pitch</dt>
<dd></dd>
<dt>Quantity</dt>
<dd>5000 a pack</dd>
<dt>Wt.</dt>
<dd>20 lb</dd>
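Answer 0 (score: 1)
If you don't need to reorder the items or change their values, and you are confident that the values themselves never contain the equals sign or pipe that the input uses as delimiters, you can apply a series of regular-expression replacements to introduce the HTML markup. Using Java's String class from Scala, this can be a dense but effective one-liner: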
"Escape test=&<>|Width=3/8 in|Length=1 in|Thread - TPI or Pitch=|Bolt/Screw Length=|Material=|Coating=|Type=Snap-On|Used With=|Quantity=5000 per pack|Wt.=20 lb|Color=".
replaceAll("&","&amp;").
replaceAll("<","&lt;").
replaceAll(">","&gt;").
replaceAll("^","<dl>\n\t<dt>").
replaceAll("=","</dt>\n\t<dd>").
replaceAll("\\|","</dd>\n\n\t<dt>").
replaceAll("$","</dd>\n</dl>")
which produces:
<dl>
<dt>Escape test</dt>
<dd>&amp;&lt;&gt;</dd>
<dt>Width</dt>
<dd>3/8 in</dd>
<dt>Length</dt>
<dd>1 in</dd>
<dt>Thread - TPI or Pitch</dt>
<dd></dd>
<dt>Bolt/Screw Length</dt>
<dd></dd>
<dt>Material</dt>
<dd></dd>
<dt>Coating</dt>
<dd></dd>
<dt>Type</dt>
<dd>Snap-On</dd>
<dt>Used With</dt>
<dd></dd>
<dt>Quantity</dt>
<dd>5000 per pack</dd>
<dt>Wt.</dt>
<dd>20 lb</dd>
<dt>Color</dt>
<dd></dd>
</dl>
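If the "no equals sign inside a value" assumption is too strong, one alternative is to swap the regex chain for plain splitting. The following is a minimal sketch, not the answer's method: the names `DefinitionList`, `escape`, and `toDl` are made up for illustration, and it assumes the pipe is still never part of a key or value.

```scala
object DefinitionList {
  // Escape the three characters that are unsafe in HTML text content.
  // "&" goes first so the entities added afterwards are not escaped again.
  def escape(s: String): String =
    s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

  // Turn "key=value|key=value|..." into a <dl> block.
  def toDl(input: String): String = {
    val items = input.split('|').toSeq.map { pair =>
      // Limit 2: split only at the first "=", so a value may contain "=".
      val parts = pair.split("=", 2)
      val key   = parts(0)
      val value = if (parts.length > 1) parts(1) else ""
      s"\t<dt>${escape(key)}</dt>\n\t<dd>${escape(value)}</dd>"
    }
    items.mkString("<dl>\n", "\n\n", "\n</dl>")
  }

  def main(args: Array[String]): Unit =
    println(toDl("Width=3/8 in|Length=1 in|Thread - TPI or Pitch=|Type=Snap-On|Color="))
}
```

Because each piece is escaped after splitting, the delimiters never collide with the inserted markup, and reordering or filtering the items becomes an ordinary collection operation.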
Answer 1 (score: 0)
Something like this:
/(?:(.*?)=(.*?)(\||$))+/
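One caveat when trying this pattern: with the outer `(?:...)+`, the whole input can be consumed by a single match, and in java.util.regex (which Scala's `Regex` wraps) a capture group inside a repetition keeps only its last iteration. A minimal sketch, assuming you drop the outer repetition and iterate over the matches instead; `PairMatches` and `pairs` are hypothetical names:

```scala
object PairMatches {
  // Inner part of the pattern, without the surrounding (?:...)+ repetition.
  private val Pair = """(.*?)=(.*?)(\||$)""".r

  // Every key/value pair in input order; empty values come back as "".
  def pairs(input: String): List[(String, String)] =
    Pair.findAllMatchIn(input).map(m => (m.group(1), m.group(2))).toList

  def main(args: Array[String]): Unit =
    pairs("Width=3/8 in|Length=1 in|Thread - TPI or Pitch=|Wt.=20 lb|Color=")
      .foreach { case (k, v) => println(s"$k -> '$v'") }
}
```

Unlike a pattern that requires a non-empty value, this one also yields the keys whose value is empty.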
Answer 2 (score: 0)
You can use
([^=|]+)=([^|]+)(?:\||$)
with the "global" flag.
Explanation:
(          # start match group 1
  [^=|]+   # any character that's not a "=" or "|", at least once
)          # end match group 1
=          # a literal "="
(          # start match group 2
  [^|]+    # any character that's not a "|", at least once
)          # end match group 2
(?:        # non-capturing group: followed by
  \|       # either a literal "|"
  |        # or…
  $        # the end of the string
)          # end non-capturing group
The parts of the string you are interested in end up in match groups 1 and 2, respectively. For me, the above matches:
Width
= 3/8 in
Length
= 1 in
Type
= Snap-On
Quantity
= 5000 per pack
Wt.
= 20 lb
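Your example is inconsistent in the Thread - TPI or Pitch case: it is the only empty-valued key in your expected output, and this pattern skips keys whose value is empty.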
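In Scala, the equivalent of the "global" flag is iterating with `findAllMatchIn`. The sketch below applies the pattern as given and, next to it, a variant that replaces the `+` in group 2 with `*` so that empty values are kept. The variant is an assumption about what the asker may want, not part of this answer, and the names `NonEmptyPairs`, `nonEmpty`, and `all` are illustrative only.

```scala
import scala.util.matching.Regex

object NonEmptyPairs {
  // Pattern from the answer: group 2 uses "+", so empty values never match.
  private val NonEmpty: Regex = """([^=|]+)=([^|]+)(?:\||$)""".r
  // Variant (an assumption, not from the answer): "*" lets group 2 be empty.
  private val AllowEmpty: Regex = """([^=|]+)=([^|]*)(?:\||$)""".r

  // Collect (group 1, group 2) for every match, in input order.
  private def collect(re: Regex, input: String): List[(String, String)] =
    re.findAllMatchIn(input).map(m => (m.group(1), m.group(2))).toList

  def nonEmpty(input: String): List[(String, String)] = collect(NonEmpty, input)
  def all(input: String): List[(String, String)]      = collect(AllowEmpty, input)

  def main(args: Array[String]): Unit = {
    val in = "Width=3/8 in|Length=1 in|Thread - TPI or Pitch=|Type=Snap-On|Color="
    println(nonEmpty(in)) // only the keys that have a value
    println(all(in))      // every key, empty values as ""
  }
}
```

Mapping either list to `<dt>`/`<dd>` pairs is then a matter of string formatting, with HTML escaping applied to each key and value.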