Question

I'm trying to write Java code that returns the value inside an HTML tag. Below is the method I've been trying to get working. Can someone help me?

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import com.seoreport.exceptions.DataNotFoundException;

public class utils {

    public String tagValue(String inHTML, String tag) throws DataNotFoundException
    {
        String value = null;

        String searchFor = "/<" + tag + ">(.*?)<\\/" + tag + "\\>/";

        Pattern pattern = Pattern.compile(searchFor);
        Matcher matcher = pattern.matcher(inHTML);

        return value;

    }

}
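
A note on why the method above always returns null: Java patterns are not wrapped in /.../ delimiters, so the slashes in searchFor are matched literally, and the Matcher is never asked to find() anything. A minimal sketch of a working variant follows. The class name TagValueRegex is made up for illustration, the custom exception is replaced by a null return, and it assumes a simple tag with no attributes and no nesting.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagValueRegex {

    // Returns the text between the first <tag>...</tag> pair, or null if there is none.
    // Assumption: the tag carries no attributes and is not nested inside itself.
    static String tagValue(String inHTML, String tag) {
        // Quote the tag name so it is matched literally; no /.../ delimiters in Java.
        String quoted = Pattern.quote(tag);
        Pattern pattern = Pattern.compile(
                "<" + quoted + ">(.*?)</" + quoted + ">", Pattern.DOTALL);
        Matcher matcher = pattern.matcher(inHTML);
        // find() has to run before group(1) is available.
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(tagValue("<html><title>Hello</title></html>", "title"));
    }
}
```

Regular expressions get fragile quickly on real HTML (attributes, nesting, unclosed tags), so this is probably only suitable for small, predictable inputs.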

Answer 1

Why not use an XML parser and access the element with XPath? You could do something like this:

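// Requires javax.xml.parsers.*, org.w3c.dom.*, java.io.File and
// org.apache.xpath.XPathAPI (Apache Xalan).
DocumentBuilder docBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

// Parse the XML file and build the Document object in memory
Document doc = docBuilder.parse(new File(fileName));

// Normalise the text representation:
// merges adjacent text nodes into a single node.
doc.getDocumentElement().normalize();

// Select every element named yourTag
String xpath = "//" + yourTag;
NodeList content = XPathAPI.selectNodeList(doc, xpath);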

This way, every matching node ends up in the content variable.

You can then read the text of a node with:

content.item(0).getTextContent();
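
If Xalan's XPathAPI is not on the classpath, roughly the same lookup can be sketched with the javax.xml.xpath API that ships with the JDK. The names TagValueXPath and firstTagText are made up for illustration. The sketch assumes the input is well-formed XML (a lot of real-world HTML is not) and that tag is a plain element name.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TagValueXPath {

    // Returns the text of the first element named tag, or null if there is none.
    // Assumption: xml is well-formed and tag is a plain element name.
    static String firstTagText(String xml, String tag) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // "//tag" selects every element with that name anywhere in the document.
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//" + tag,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : null;
        } catch (XPathExpressionException e) {
            // Raised for a bad expression or for input that cannot be parsed.
            throw new IllegalArgumentException("Could not evaluate XPath for tag " + tag, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<html><head><title>Hello</title></head></html>";
        System.out.println(firstTagText(xml, "title"));
    }
}
```

For markup that is not well-formed, a lenient HTML parser would likely be a better fit than a strict XML one.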

Return the value of an HTML tag in Java
