I want to do something like this, so that I am left with only the website part of the string. I am having trouble with the quotes inside the string.
// This is what I read into a string:
// <td width="118"><a href="research.html" class="navText style10 style12">
// I want to parse this so that I am left with only research.html.
// Sometimes I also get a string that contains:
// <a href="http://www.ucalgary.ca" class="style18"><font size="3">University of Calgary</font></a></div>
// From this string I want to keep http://www.ucalgary.ca
What I have so far does not work for every case. I would appreciate your help! My code is:
public class Parse
{
    public static void main(String[] args)
    {
        String h = "<a href=\"http://www.departmentofmedicine.com/policy.htm\">";
        int n = getIndexOf(h, '"', 0);
        String[] a = h.substring(n).split(">");
        String url = a[0].replaceAll("\"", "");
        //String value = a[1].replaceAll("</a", "");
        System.out.println(url + " " );
    }

    // Returns the index of the (n+1)-th occurrence of c in str, or -1 if there is none.
    public static int getIndexOf(String str, char c, int n)
    {
        int pos = str.indexOf(c, 0);
        while (n-- > 0 && pos != -1)
        {
            pos = str.indexOf(c, pos + 1);
        }
        return pos;
    }
}
Answer 0 (score: 0)
I would try Pattern and Matcher, like this:
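import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "<a href=\"http://www.departmentofmedicine.com/policy.htm\">";
// Group 1 captures everything between href=" and the next double quote.
Pattern p = Pattern.compile(".*href=\"([^\"]*).*");
Matcher m = p.matcher(s);
if (m.matches()) {
    System.out.println(m.group(1));
}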
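Because `matches()` has to match the entire input and `.` does not cross line breaks by default, the pattern above only works on single-line strings and reports at most one link. A minimal sketch of a variant that uses `find()` instead, assuming the `href` value is always wrapped in double quotes; the class name `HrefExtractor` and the method `extractHrefs` are made up for illustration and are not part of the original answer:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrefExtractor {
    // Assumption: the attribute is written exactly as href="..." with double quotes.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");

    // Returns every double-quoted href value in the fragment, in order of appearance.
    public static List<String> extractHrefs(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {          // find() scans for the next match anywhere in the input
            urls.add(m.group(1));   // group 1 is the text between the quotes
        }
        return urls;
    }

    public static void main(String[] args) {
        String a = "<td width=\"118\"><a href=\"research.html\" class=\"navText style10 style12\">";
        String b = "<a href=\"http://www.ucalgary.ca\" class=\"style18\"><font size=\"3\">University of Calgary</font></a></div>";
        System.out.println(extractHrefs(a)); // [research.html]
        System.out.println(extractHrefs(b)); // [http://www.ucalgary.ca]
    }
}
```

This handles the two strings from the question, but it is still a regex over raw text. For real pages with single-quoted or unquoted attributes, an HTML parser is likely the more robust choice.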
Answer 1 (score: 0)
A short snippet:
String h = "http://www.departmentofmedicine.com/policy.htm\">";
String url = h.substring(h.indexOf("http")).replace("\">", "");
System.out.println(url);
The output will be: http://www.departmentofmedicine.com/policy.htm
Tested on my machine.
Please also post the other possible cases, so I can suggest a better solution.
To handle all three cases:
//String h1 = "<a href=\"http://www.departmentofmedicine.com/policy.htm\">";
//String h1 = "<a href=\"ucalgary.ca\" class=\"style18\"><font size=\"3\">University of Calgary</font></a>";
String h1 = "<td width=\"118\"><a href=\"research.html\" class=\"navText style10 style12\">";
// Take everything between href=" and the next double quote.
String marker = "href=\"";
int start = h1.indexOf(marker) + marker.length();
String url = h1.substring(start, h1.indexOf('"', start));
System.out.println(url);
Uncomment each String h1 line in turn to check it against your requirements.
The code above produces the following output:
research.html
http://www.departmentofmedicine.com/policy.htm
ucalgary.ca
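The snippet above assumes that the input always contains `href="` followed by a closing quote. If that is not the case, it may return the wrong text or throw a `StringIndexOutOfBoundsException`. A minimal sketch of the same `indexOf` approach wrapped in a method with those checks added; the class name `HrefSubstring` and the method `firstHref` are hypothetical names chosen for this example:

```java
public class HrefSubstring {
    // Assumption: the attribute is written exactly as href="..." with double quotes.
    private static final String MARKER = "href=\"";

    // Returns the first double-quoted href value, or null if none is found.
    public static String firstHref(String html) {
        int markerPos = html.indexOf(MARKER);
        if (markerPos == -1) {
            return null;                        // no href attribute at all
        }
        int start = markerPos + MARKER.length();
        int end = html.indexOf('"', start);     // closing quote of the value
        if (end == -1) {
            return null;                        // opening quote without a closing one
        }
        return html.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(firstHref("<a href=\"http://www.departmentofmedicine.com/policy.htm\">"));
        System.out.println(firstHref("<td width=\"118\"><a href=\"research.html\" class=\"navText style10 style12\">"));
        System.out.println(firstHref("<td width=\"118\">no link here</td>")); // null
    }
}
```

Returning `null` keeps the sketch short; an `Optional<String>` would make the "no link" case harder to overlook.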