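I'm new to Spring and jsoup. I'm using jsoup to parse an HTML file, copy some text from inside a div tag, and display it on my page. Now I'm trying to modify the links in that content and add exit.do to them so that the user is logged out of the server. I've tried a lot of different approaches, but my links don't work :( Has anyone dealt with this kind of link update before? Any help is appreciated.
Here is my code.
Thanks a lot.
Lola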
modelMap = referenceData( request, modelMap);
modelMap.put("externalUrl", externalUrlMap.get( request.getServletPath() ));
modelMap.put("elementId", elementIdMap.get( request.getServletPath() ));
/** Pass the url map to a string */
String url = (String) externalUrlMap.get( request.getServletPath() );
/** Pass the div map to a string */
String eleId = (String) elementIdMap.get( request.getServletPath() );
/** Retrieve and parse the document using Jsoup*/
//URL externalUrl = new URL(url);
//Document document = Jsoup.parse(externalUrl, 10000);
File internalFile = new File(url);
Document document = Jsoup.parse(internalFile, "UTF-8");
/** Clean the document to prevent XSS only include tags and style below */
//document = new Cleaner(Whitelist.basic().addTags("div", "em", "h1", "h2").addAttributes("div","class", "style")).clean(document);
/** Select privactText tags from the id */
Element divContent = document.select(eleId).first();
/** Returned the text inside the div tag */
String parsedExternalContent = divContent.html();
/** Get all links inside div tag */
Elements links = divContent.select("a[href]");
String exitUrl = "/exit?logout=true&uri=";
/** Loop through the links and if the links are relative path add the exit.do to the link */
for (Element link : links) {
    if (!link.attr("href").toLowerCase().startsWith("http://")) {
        String urltext = link.attr("href");
        String exitText = "/exit?logout=true&uri=";
        ...
    }
}
modelMap.addAttribute("parsedExternalContent", parsedExternalContent);
return new ModelAndView ("externalParserContent", modelMap);
Answer 0 (score: 0)
When I needed to rewrite the original string with "encoded" URLs, this is how I did it:
Document doc = getHtmlDocumentFromString(htmlOnly);
Elements links = doc.select("a[href]");
/**
* since we would want to track link index per click - iterate links in the old fashion way (Elements is a List<Element>)
*/
for (int linkIndexTopToBottom = 0; linkIndexTopToBottom < links.size(); linkIndexTopToBottom++) {
    try {
        Element link = links.get(linkIndexTopToBottom);
        if (!UriUtils.isValidUrl(link.attr("href")))
            continue;
        ...
        link.attr("href",<NEW URL>);
    } catch (MalformedURLException exception) {
        log.debug("Provided URL was not valid: " + links.get(linkIndexTopToBottom).attr("abs:href") + ", skipping link re-write");
    }
}
return doc;
As you can see, you need to set the attribute like this:
link.attr("href", <NEW URL>);
Since that part is missing from your post, I'm not sure whether you are doing this.
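One more thing that may be worth checking in the question's code: `parsedExternalContent` is read from `divContent.html()` before the loop over the links runs, so even a correct `link.attr("href", ...)` call would not show up in that string. Here is a minimal sketch of the order I would try, assuming the exit prefix is simply placed in front of the original href. The class `LinkRewriteSketch` and the method `rewriteLinks` are hypothetical names for illustration, and treating anything that does not start with `http://` or `https://` as relative is a simplification.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

final class LinkRewriteSketch {

    /**
     * Parses the HTML, rewrites relative links inside the selected container,
     * and only then serializes the container's inner HTML.
     * (Hypothetical helper, not part of jsoup.)
     */
    static String rewriteLinks(String html, String selector, String prefix) {
        Document document = Jsoup.parse(html);
        Element container = document.select(selector).first();
        if (container == null) {
            return ""; // nothing matched the selector
        }
        for (Element link : container.select("a[href]")) {
            String href = link.attr("href");
            String lower = href.toLowerCase();
            // Simplified check: leave absolute http/https links untouched.
            if (!lower.startsWith("http://") && !lower.startsWith("https://")) {
                // This is the step that actually changes the document.
                link.attr("href", prefix + href);
            }
        }
        // Read the HTML AFTER the loop so the new href values are included.
        return container.html();
    }
}
```

Called as `rewriteLinks(html, eleId, "/exit?logout=true&uri=")`, this should return the div content with the relative links already rewritten. Note that jsoup escapes `&` as `&amp;` inside attribute values when it serializes, which is normal HTML and is decoded again by the browser.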
Edit:
Appending would be exactly the same idea:
link.attr("href", link.attr("href") + "<what you need to append with>");
The bottom line is that you need to set the href attribute to the new value.
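As for building the new value itself, here is a rough sketch using only the JDK (Java 10 or later for this `URLEncoder.encode` overload). It assumes the exit handler expects the original link in the `uri` query parameter, as the `/exit?logout=true&uri=` string in the question suggests, and that the value should be URL-encoded so that a `?` or `&` in the original href does not get mixed up with the exit URL's own parameters. Both points are my assumptions about your server, and `ExitLinks`, `isRelative`, and `toExitUrl` are made-up names.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class ExitLinks {

    // Prefix taken from the question; the target of "uri" is the original link.
    static final String EXIT_PREFIX = "/exit?logout=true&uri=";

    /** True when the href has neither a scheme (http:, mailto:, ...) nor a host. */
    static boolean isRelative(String href) {
        try {
            URI uri = new URI(href.trim());
            return !uri.isAbsolute() && uri.getRawAuthority() == null;
        } catch (URISyntaxException e) {
            return false; // unparsable hrefs are left alone
        }
    }

    /** Returns the exit URL for relative hrefs, and the href unchanged otherwise. */
    static String toExitUrl(String href) {
        if (!isRelative(href)) {
            return href;
        }
        // Assumption: the exit handler decodes the "uri" parameter itself.
        return EXIT_PREFIX + URLEncoder.encode(href, StandardCharsets.UTF_8);
    }
}
```

Inside the loop this would be used as `link.attr("href", ExitLinks.toExitUrl(link.attr("href")))`. Compared with the `startsWith("http://")` check in the question, parsing the href with `java.net.URI` also leaves `https://`, `mailto:` and protocol-relative (`//host/...`) links untouched.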
Example from the jsoup cookbook