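How do I extract URLs from XML-formatted data using Notepad++?

Time: 2018-05-22 13:24:04

Tags: regex xml

I have a list of several hundred sites in a config.xml file. Based on the binding tags listed below, I need to extract all the URLs, without the port 80 part. What is the most accurate regex syntax for creating such a list? I am using Notepad++ for editing.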

<binding protocol="http" bindingInformation="217.145.55.21:80:rwasianew.inforce.dk" />
<binding protocol="http" bindingInformation="217.145.55.86:80:rwasianew.inforce.dk" />
<binding protocol="http" bindingInformation="*:80:rwasianew-cn.inforce.dk" />
<binding protocol="http" bindingInformation="*:80:rwasianew-th.inforce.dk" />
<binding protocol="http" bindingInformation="217.145.55.86:80:rwasianew-splash.inforce.dk" />
<binding protocol="http" bindingInformation=":80:rwbuilddeskgb.synkronvia.com" />
<binding protocol="http" bindingInformation=":80:rwbuilddeskdk.synkronvia.com" />
<binding protocol="http" bindingInformation=":80:rwbuilddesknl.synkronvia.com" />
<binding protocol="http" bindingInformation=":80:rwbuilddeskde.synkronvia.com" />
<binding protocol="http" bindingInformation=":80:rwbuilddeskint.synkronvia.com" />
<binding protocol="http" bindingInformation=":80:rwbuilddeskpl.synkronvia.com" />

2 answers:

Answer 0: (score: 2)

Not a regex, but XSLT:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" indent="no" encoding="utf-8" media-type="text/plain"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="//binding">
    <xsl:value-of select="substring-after(substring-after(@bindingInformation, ':'), ':')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>

Output:

rwasianew.inforce.dk
rwasianew.inforce.dk
rwasianew-cn.inforce.dk
rwasianew-th.inforce.dk
rwasianew-splash.inforce.dk
rwbuilddeskgb.synkronvia.com
rwbuilddeskdk.synkronvia.com
rwbuilddesknl.synkronvia.com
rwbuilddeskde.synkronvia.com
rwbuilddeskint.synkronvia.com
rwbuilddeskpl.synkronvia.com

Various online XSLT services can run this transformation.
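For readers without an XSLT processor at hand, the same extraction can be sketched with Python's standard library. This is only an illustration under stated assumptions: the `<bindings>` wrapper element, the shortened sample, and the function name `extract_hosts` are made up here so the snippet parses on its own; they are not part of the original config.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the <bindings> root is an assumption so the text is well-formed XML.
SAMPLE = """<bindings>
  <binding protocol="http" bindingInformation="217.145.55.21:80:rwasianew.inforce.dk" />
  <binding protocol="http" bindingInformation="*:80:rwasianew-cn.inforce.dk" />
  <binding protocol="http" bindingInformation=":80:rwbuilddeskgb.synkronvia.com" />
</bindings>"""


def extract_hosts(xml_text):
    """Return the text after the second ':' of every bindingInformation attribute."""
    root = ET.fromstring(xml_text)
    hosts = []
    for binding in root.iter("binding"):
        info = binding.get("bindingInformation", "")
        # Mirrors substring-after(substring-after(@bindingInformation, ':'), ':'):
        # an empty string is produced when fewer than two colons are present.
        parts = info.split(":", 2)
        hosts.append(parts[2] if len(parts) == 3 else "")
    return hosts


if __name__ == "__main__":
    print("\n".join(extract_hosts(SAMPLE)))
```

Like the stylesheet, this sketch does not filter on the port; it simply drops everything up to and including the second colon.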

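Answer 1: (score: 0)

Press Ctrl+H to open the Replace dialog.

Search Mode: Regular expression (leave ". matches newline" unchecked, so that each match stays on a single line)

Find what: <binding protocol=".*" bindingInformation=".*:80:(.*)" \/>

Replace with: \1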
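The find-and-replace can also be checked outside Notepad++ with a short Python sketch. It uses a slightly stricter pattern than the one above: `[^"]*` replaces `.*`, a change assumed here so that a match can never run past an attribute's closing quote. The function name `hosts_on_port_80` and the port 443 sample line are illustrative only.

```python
import re

# Stricter variant of the Notepad++ pattern: [^"]* cannot cross a closing quote,
# so each capture stays inside one bindingInformation attribute.
BINDING_RE = re.compile(
    r'<binding protocol="[^"]*" bindingInformation="[^"]*:80:([^"]*)" />'
)


def hosts_on_port_80(text):
    """Return the host name of every binding that is bound to port 80."""
    return BINDING_RE.findall(text)


if __name__ == "__main__":
    # Two lines from the question plus one made-up port 443 line that must be skipped.
    sample = """
    <binding protocol="http" bindingInformation="217.145.55.86:80:rwasianew-splash.inforce.dk" />
    <binding protocol="http" bindingInformation="*:80:rwasianew-th.inforce.dk" />
    <binding protocol="https" bindingInformation=":443:example.invalid" />
    """
    for host in hosts_on_port_80(sample):
        print(host)
```

Unlike the in-editor replacement, which leaves non-matching lines in the file, this returns only the captured host names.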