I need a Java regular expression that extracts the image src from inside the script tag in the code below. Please help me solve this. Thanks.
<script language="javascript"><!--
document.write('<a href="javascript:popupWindow(\'https://www.kitchenniche.ca/prepara-adjustable-oil-pourer-pi-5597.html?invis=0\')">
<img src="images/imagecache/prepara-adjustable-oil-pourer-1.jpg" border="0" alt="Prepara Adjustable Oil Pourer" title=" Prepara Adjustable Oil Pourer " width="170" height="175" hspace="5" vspace="5">
<br>
</a>');
--></script>
Answer 0 (score: 0)
Try this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sample markup; note that attribute values here use single quotes
String mydata = "<script language='javascript'><!--document.write('<a href='javascript:popupWindow"
        + "(\'https://www.kitchenniche.ca/prepara-adjustable-oil-pourer-pi-5597.html?invis=0\')'><img "
        + "src='images/imagecache/prepara-adjustable-oil-pourer-1.jpg' border='0' alt='Prepara Adjustable Oil Pourer' "
        + "title=' Prepara Adjustable Oil Pourer ' width='170' height='175' hspace='5' vspace='5'><br></a>');</script>";
// Lazily capture everything between src=' and the next single quote
Pattern pattern = Pattern.compile("src='(.*?)'");
Matcher matcher = pattern.matcher(mydata);
if (matcher.find()) {
    // Group 1 holds the src value without the surrounding quotes
    System.out.println(matcher.group(1));
}
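The pattern above only matches single-quoted values, while the markup in the question uses double quotes. A minimal sketch of a quote-agnostic variant follows; the class name QuoteAgnosticSrc and the helper firstSrc are made up for illustration. It captures the opening quote in group 1 and uses the backreference \1 to require the same quote at the end, so the src value ends up in group 2.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteAgnosticSrc {
    // Group 1 is the opening quote (' or "); \1 demands the same character as the closing quote.
    // Group 2 is the attribute value between the quotes.
    private static final Pattern SRC = Pattern.compile("src\\s*=\\s*(['\"])(.*?)\\1");

    // Hypothetical helper: returns the first src value found, or null if there is none.
    static String firstSrc(String html) {
        Matcher m = SRC.matcher(html);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String doubleQuoted = "<img src=\"images/imagecache/prepara-adjustable-oil-pourer-1.jpg\" border=\"0\">";
        String singleQuoted = "<img src='images/imagecache/prepara-adjustable-oil-pourer-1.jpg' border='0'>";
        System.out.println(firstSrc(doubleQuoted));
        System.out.println(firstSrc(singleQuoted));
    }
}
```

Both calls should print the same image path, since the pattern does not care which quote style the markup uses.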
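Answer 1 (score: 0)
The regex in the code below finds the content of the src attribute only if src comes immediately after <img. If src is not the first attribute of the img tag, you need a more complex regex.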
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        // The markup from the question, with double quotes escaped for a Java string literal
        String s = "<script language=\"javascript\"><!--\r\n"
                + " document.write('<a href=\"javascript:popupWindow(\\'https://www.kitchenniche.ca/prepara-adjustable-oil-pourer-pi-5597.html?invis=0\\')\">\r\n"
                + "<img src=\"images/imagecache/prepara-adjustable-oil-pourer-1.jpg\" border=\"0\" alt=\"Prepara Adjustable Oil Pourer\" title=\" Prepara Adjustable Oil Pourer \" width=\"170\" height=\"175\" hspace=\"5\" vspace=\"5\">\r\n"
                + "<br>\r\n" + "</a>');\r\n" + "--></script>";
        // Capture everything after <img src=" up to, but not including, the next double quote
        Pattern pattern = Pattern.compile("<img src=\"([^\"]+)");
        Matcher matcher = pattern.matcher(s);
        // Loop so that every img tag in the input is reported, not just the first
        while (matcher.find()) {
            String group = matcher.group(1);
            System.out.println(group);
        }
    }
}
([^\"]+) matches one or more characters other than " and places the match in group 1. In Java, the " must be escaped as \" inside a string literal.
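For the case where src is not the first attribute, here is a rough sketch of a more tolerant pattern. The class name ImgSrcAnywhere and the method extract are hypothetical, and the sample string is a shortened variant of the markup in the question with the attributes reordered. The pattern skips any non-> characters after <img, then requires whitespace before src so that an attribute such as data-src is not picked up. It still assumes double-quoted values and no > inside attribute values; for arbitrary HTML, a real parser is likely to be more robust than a regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImgSrcAnywhere {
    // <img, then lazily any characters except '>', then whitespace + src="...".
    // Group 1 is the src value. Tag and attribute names are matched case-insensitively.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: collects the src value of every img tag found in the input.
    static List<String> extract(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // src appears after border and alt here, unlike in the original markup
        String html = "document.write('<a href=\"#\"><img border=\"0\" alt=\"Oil Pourer\" "
                + "src=\"images/imagecache/prepara-adjustable-oil-pourer-1.jpg\" width=\"170\"></a>');";
        System.out.println(extract(html));
    }
}
```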