Android: parse HTML and keep the href links

Time: 2015-04-09 14:39:52

Tags: android html parsing hyperlink href

I have an HTML string such as:

One link is <a href="http://image.html">sajfhds iufl</a>

How can I convert this HTML string into a string that contains the link but no HTML tags? The result should be:

One link is http://image.html

2 answers:

Answer 0: (score: 0)

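// Four alternative patterns; use one at a time.
// ^ anchors the match to the start of the input, so the URL must come first.
String regex = "^(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";

// \b (word boundary) instead of ^ lets the URL be found anywhere in the input.
String regex = "\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";

String regex = "<\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]>"; // matches <http://google.com>

// ^ placed after "<" can never succeed, because the start of the input has already been passed.
String regex = "<^(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]>"; // does not match <http://google.com>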
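
The patterns above come without a usage example, so here is a minimal sketch of how they could be tried out with `java.util.regex`. The class name `UrlRegexDemo`, the constant `URL_BODY`, and the helper `firstMatch` are made up for illustration; only the regex text comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UrlRegexDemo {
    // The part shared by all four patterns in the answer (hypothetical constant name).
    static final String URL_BODY =
            "(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";

    /** Returns the first substring of input that the regex finds, or null if there is none. */
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "One link is <a href=\"http://image.html\">sajfhds iufl</a>";
        String bracketed = "<http://google.com>";

        // ^ requires the URL at the very start of the input, so nothing is found here.
        System.out.println(firstMatch("^" + URL_BODY, html));             // null
        // \b finds the URL wherever it appears.
        System.out.println(firstMatch("\\b" + URL_BODY, html));           // http://image.html
        // Angle-bracketed form from the third pattern.
        System.out.println(firstMatch("<\\b" + URL_BODY + ">", bracketed)); // <http://google.com>
        // ^ after "<" cannot match, as the fourth pattern's comment says.
        System.out.println(firstMatch("<^" + URL_BODY + ">", bracketed));   // null
    }
}
```

For the string in the question, the `\b` variant is the one that pulls out `http://image.html`; it only extracts the URL and does not by itself remove the surrounding tags.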

Answer 1: (score: 0)

You will have a string like this

One link is <a href="http://image.html">sajfhds iufl</a>

What you need is

One link is <a href="http://image.html">http://image.html</a>

So what you should do is find the pattern with the code below

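// imports required
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkTextReplacer {
    public static void main(String[] args) {
        String stringToSearch = "<a href = \"http://image.html\" > sajfhds iufl</a>";

        // the pattern we want to search for:
        // group 1 captures the href value, group 2 the visible link text
        Pattern p = Pattern.compile("<a href\\s*=\\s*\"(.+?)\"\\s*>(.+?)</a>");
        Matcher m = p.matcher(stringToSearch);

        if (m.find()) {
            // Splice by position so only the link text itself is replaced;
            // String.replace would also rewrite any other occurrence of the same text.
            String temp = stringToSearch.substring(0, m.start(2))
                    + m.group(1)
                    + stringToSearch.substring(m.end(2));
            // use the temp string for display
            System.out.println(temp);
        }
    }
}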
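
The code above keeps the `<a>` tag and only swaps the link text, while the question asks for a result with no tags at all. A possible variation, sketched here under the assumption that every anchor is a simple `<a href="...">text</a>` with no other attributes, replaces each whole anchor with its href via `Matcher.replaceAll`. The class name `AnchorFlattener` and the method `flatten` are hypothetical, and a regex like this is not a general HTML parser.

```java
import java.util.regex.Pattern;

final class AnchorFlattener {
    // Based on the answer's pattern; group 1 is the href value.
    // Assumption: the anchor has no attributes other than href.
    private static final Pattern ANCHOR =
            Pattern.compile("<a\\s+href\\s*=\\s*\"(.+?)\"\\s*>.*?</a>");

    /** Replaces every simple anchor element in html with the URL from its href. */
    static String flatten(String html) {
        // "$1" in the replacement refers to capture group 1 of each match.
        return ANCHOR.matcher(html).replaceAll("$1");
    }

    public static void main(String[] args) {
        String html = "One link is <a href=\"http://image.html\">sajfhds iufl</a>";
        System.out.println(flatten(html)); // One link is http://image.html
    }
}
```

Input without a matching anchor is returned unchanged, so the method can be applied to arbitrary text before display.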