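I want to download a web page's source code and extract the JSON from it.
Here you can switch to the source view, press Ctrl+F
and search for var data.
That is the part I need.
Here is my code: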
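import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Parser {
    // Captures everything that follows "var data = " up to the next line terminator
    static Pattern DATA_PATTERN = Pattern.compile("var data = (.*)");

    public static void main(String[] args) throws IOException {
        String webPage = new Parser().getUrlSource("http://satiksme.daugavpils.lv/tramvajs-nr-1-butlerova-iela-stacija");
        if (webPage != null) {
            Matcher m = DATA_PATTERN.matcher(webPage);
            if (m.find()) {
                String extracted = m.group(1).trim();
                System.out.println(extracted);
            }
        }
    }

    // Downloads the page at the given URL and returns its body as one string
    public String getUrlSource(String url) throws IOException {
        URL yahoo = new URL(url);
        URLConnection yc = yahoo.openConnection();
        BufferedReader in = new BufferedReader(new InputStreamReader(
                yc.getInputStream(), "UTF-8"));
        String inputLine;
        StringBuilder a = new StringBuilder();
        // Read the response line by line and concatenate the lines
        while ((inputLine = in.readLine()) != null)
            a.append(inputLine);
        in.close();
        return a.toString();
    }
}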
The problem is that Pattern.compile("var data = (.*)")
does not work well. I want only the JSON,
without the extra HTML tags.
The actual result right now is:
the JSON, followed by
$(document).ready(function () { $(".sations ul").html(""); var selst = window.location.hash.replace("#", ""); $.each(data.stations, function (index, val) { var cls = "even"; if (index % 2 == 0) cls = "odd"; $(".sations ul").append("<li class='" + cls + "' id='station-" + val.sid + "' onclick='return showStation(" + val.sid + ")'><span class='station-name'>" + val.name + "</span></li>"); if (index == 0) { if (!selst) selst = val.sid; } }); showStation(selst); initmap(defaultLat, defaultLng, defaultZoom); });</script></article></div> </div> </div></div></div><div id="layout-footer" class="group"> <footer id="footer"> <div id="footer-quad" class="group"> </div> <div id="footer-sig" class="group"> <div class="zone zone-footer"><div class="credits"><span class="copyright">Copyright © 2014 <b>SIA Daugavpils Satiksme</b>. All rightd reserved.</span><span class="poweredby">Izstrādāts <a href="http://www.latinsoft.lv" target="_blank">Latinsoft</a>. Izmantojot <a href="http://www.orchardproject.net" rel="nofollow" target="_blank">Orchard</a>.</span></div><div class="user-display"> <span class="user-actions"><a href="/Users/Account/LogOn?ReturnUrl=%2Ftramvajs-nr-1-butlerova-iela-stacija" rel="nofollow">Sign In</a></span></div></div> </div> </footer></div></div><script src="/Modules/Traffic/scripts/leaflet.js" type="text/javascript"></script><script src="/Modules/Traffic/scripts/dsapi.js" type="text/javascript"></script><script src="http://code.jquery.com/jquery-migrate-1.2.1.js" type="text/javascript"></script><script src="/Themes/TheThemeMachine/scripts/lispage.js" type="text/javascript"></script><script src="/Themes/TheThemeMachine/scripts/jquery.nivo.slider.js" type="text/javascript"></script></body></html>
Expected result: only the JSON.
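One possible cause, stated as an assumption rather than a confirmed diagnosis: BufferedReader.readLine() strips line terminators, and getUrlSource appends the lines without putting them back, so the whole page becomes a single line and (.*) runs to the end of the document. The sketch below reproduces that effect on a small hypothetical sample string (not the real page), assuming the JSON literal sits on one line in the page source. The class name LineBreakDemo and the helpers readAll and extract are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineBreakDemo {
    // Same pattern as in the question; '.' does not match line terminators by default
    static final Pattern DATA_PATTERN = Pattern.compile("var data = (.*)");

    // Reads the text line by line, like getUrlSource does.
    // keepLineBreaks decides whether '\n' is appended after each line.
    static String readAll(String source, boolean keepLineBreaks) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new StringReader(source))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line);
                if (keepLineBreaks) {
                    sb.append('\n');
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Returns the text captured after "var data = ", or null if there is no match
    static String extract(String page) {
        Matcher m = DATA_PATTERN.matcher(page);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the downloaded page
        String sample = "<script>\n"
                + "var data = {\"stations\":[]};\n"
                + "$(document).ready(function () {});\n"
                + "</script>";

        // Lines joined without breaks: the match continues to the end of the text
        System.out.println(extract(readAll(sample, false)));
        // Line breaks kept: the match stops at the end of the "var data" line
        System.out.println(extract(readAll(sample, true)));
    }
}
```

With the line breaks kept, the second call prints {"stations":[]}; with the trailing semicolon still attached, so it would need to be removed before handing the text to a JSON parser.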
P.S. This Pattern
works perfectly on Android. Can someone explain why?
Thanks!