Question

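I need to extract the text that sits between two quotation marks and use it. How can I do this? I know a regex is the solution, but I have never used one. Here is the HTML I receive after making an HTTP request: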

<p>To reset your password, please follow this link: <a href="http://my.code.com/admin/resetPassword/JaI94">reset password</a>.</p>

I want to grab the URL and put it in a variable, like this:

String url = "http://my.code.com/admin/resetPassword/JaI94";

Thanks, and sorry for the bad question! I don't understand why I am getting so many downvotes.

Answer 1

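There are several ways to do this. The simplest, if you know there is only one pair of quotes on the line, is probably something like this: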

String[] tokens = line.split("\"");
String url = tokens[1];  // also, tokens[0] is the text before the first quote,
                         // and tokens[2] is the text after the second quote

You can also do this by finding the quote positions with indexOf and taking the text between them with substring, or by using a regular expression.
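The indexOf/substring approach mentioned above could look roughly like the sketch below. The class name QuoteExtractor and the method firstQuoted are made up for illustration, and returning null when no pair of quotes is found is an assumption, not something the question specifies.

```java
public class QuoteExtractor {

    /**
     * Returns the text between the first pair of double quotes in the line,
     * or null if the line does not contain two quotes (assumed behavior).
     */
    public static String firstQuoted(String line) {
        int start = line.indexOf('"');           // position of the opening quote
        if (start < 0) {
            return null;
        }
        int end = line.indexOf('"', start + 1);  // position of the closing quote
        if (end < 0) {
            return null;
        }
        return line.substring(start + 1, end);   // text strictly between the quotes
    }

    public static void main(String[] args) {
        String line = "<p>To reset your password, please follow this link: "
                + "<a href=\"http://my.code.com/admin/resetPassword/JaI94\">reset password</a>.</p>";
        String url = firstQuoted(line);
        System.out.println(url);
    }
}
```

Unlike split, this does not build an array and does not throw if the line has no quotes at all.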

Answer 2

You can define a regular expression:

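// needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
String testString = "<p>To reset your password, please follow this link: <a href=\"http://my.code.com/admin/resetPassword/JaI94\">reset password</a>.</p>";
String regex = "\"([^\"]*)\"";  // a quote, any run of non-quote characters (captured), a quote
Pattern pat = Pattern.compile(regex);
Matcher m = pat.matcher(testString);
if (m.find()) {
    System.out.println(m.group(1));  // group 1 is the text between the quotes
}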

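The regular expression matches anything between the first and second " characters; group(1) holds that text without the surrounding quotes.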
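If the tag might carry other quoted attributes before href, the first pair of quotes is no longer the URL. One possible variation, sketched here under the assumption that the attribute is always written as href="..." with double quotes, is to anchor the pattern on the attribute name. The class HrefRegex and the method firstHref are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrefRegex {

    // Assumption: the attribute appears literally as href="..." (double quotes, no spaces around '=').
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");

    /** Returns the value of the first href attribute, or null if none is found. */
    public static String firstHref(String html) {
        Matcher m = HREF.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<p>To reset your password, please follow this link: "
                + "<a class=\"btn\" href=\"http://my.code.com/admin/resetPassword/JaI94\">reset password</a>.</p>";
        System.out.println(firstHref(html));
    }
}
```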

Answer 3

An example using DocumentBuilder (the fragment must be valid XHTML):

    String fragment = "<p>To reset your password, please follow this link: <a href=\"http://my.code.com/admin/resetPassword/JaI94\">reset password</a>.</p>";

    String link = DocumentBuilderFactory
            .newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(fragment.getBytes("utf-8")))
            .getDocumentElement()
            .getElementsByTagName("a")
            .item(0)
            .getAttributes()
            .getNamedItem("href")
            .getTextContent();

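It is verbose, but it is also unaffected by stray quotes or by other attributes and tags (in case your input fragment varies unpredictably). You will need to handle exceptions appropriately.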
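As one way to handle those exceptions, the chain can be wrapped in a small helper. This is only a sketch: the names LinkParser and firstLink are invented here, and returning null for a malformed fragment or a missing link is an assumed policy that you may want to replace with logging or rethrowing.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class LinkParser {

    /**
     * Returns the href of the first <a> element in a well-formed XHTML fragment,
     * or null if the fragment cannot be parsed or has no such attribute.
     */
    public static String firstLink(String fragment) {
        try {
            NodeList anchors = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(fragment.getBytes(StandardCharsets.UTF_8)))
                    .getElementsByTagName("a");
            if (anchors.getLength() == 0) {
                return null;                      // no <a> element at all
            }
            Node href = anchors.item(0).getAttributes().getNamedItem("href");
            return href == null ? null : href.getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return null;                          // assumed policy: treat bad input as "no link"
        }
    }

    public static void main(String[] args) {
        String fragment = "<p>To reset your password, please follow this link: "
                + "<a href=\"http://my.code.com/admin/resetPassword/JaI94\">reset password</a>.</p>";
        System.out.println(firstLink(fragment));
    }
}
```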

How do I get the text between quotes in Java using a regex?
