I have this string:
<dependencies style="typed">
<dep type="dep">
<governor idx="1">Maria</governor>
<dependent idx="2">mrge</dependent>
</dep>
<dep type="dep">
<governor idx="2">mrge</governor>
<dependent idx="3">la</dependent>
</dep>
<dep type="dep">
<governor idx="1">Maria</governor>
<dependent idx="4">scoala</dependent>
</dep>
</dependencies>
I am trying to parse it, but I get the exception below and I don't know how to fix it.
This is the error:
3:1: Content is not allowed in prolog.
org.xml.sax.SAXParseException; lineNumber: 3; columnNumber: 1; Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at versionTwo.Analyze.convertStringToDocument(Analyze.java:348)
at versionTwo.Analyze.depRel(Analyze.java:299)
at versionTwo.MainClass.main(MainClass.java:17)
Exception in thread "main" java.lang.NullPointerException
at versionTwo.Analyze.depRel(Analyze.java:300)
This is my code:
public String depRel(String graph) throws SAXException, IOException,
ParserConfigurationException {
String xmlString;
xmlString = Features.dependencyGraph(graph);
String result = "";
System.out.println("A value og dependency graph is;" + xmlString);
Document document = parseXmlFromString(xmlString);
document.getDocumentElement().normalize();
Element root = document.getDocumentElement();
NodeList nList = document.getElementsByTagName("dependencies");
for (int temp = 0; temp < nList.getLength(); temp++) {
Node node = nList.item(temp);
if (node.getNodeType() == Node.ELEMENT_NODE) {
// Print each employee's detail
Element eElement1 = (Element) node;
}
NodeList nodesDocPart = node.getChildNodes();
for (int temp2 = 0; temp2 < nodesDocPart.getLength(); temp2++) {
Node n = nodesDocPart.item(temp2);
// /////////////////////////////////////////////////sentence/////////////////////////////////////////////
NodeList nodesSentencePart = n.getChildNodes();
for (int temp3 = 0; temp3 < nodesSentencePart.getLength(); temp3++) {
Node sentence = nodesSentencePart.item(temp3);
if (sentence.getNodeType() == Node.ELEMENT_NODE) {
Element eElement4 = (Element) sentence;
System.out.println("Sentence : "
+ eElement4.getTextContent());
result = eElement4.getTextContent() + "\n";
}
}
}
}
return result;
}
public Document parseXmlFromString(String xmlString)
throws ParserConfigurationException, SAXException, IOException {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputStream inputStream = new ByteArrayInputStream(xmlString.getBytes());
org.w3c.dom.Document document = builder.parse(inputStream);
return document;
}
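This is my method, which parses a sentence and builds a String from the resulting XML. I want to read that string as XML in another class, but I get the error I posted. Any ideas?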
public static String dependencyGraph(String s) {
Properties props = new Properties();
props.put("annotators",
"tokenize, ssplit, pos, lemma, ner, parse, dcoref,depparse");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
Annotation document = new Annotation(s);
pipeline.annotate(document);
CoreMap sentence = document.get(
CoreAnnotations.SentencesAnnotation.class).get(0);
SemanticGraph dependency_graph = sentence
.get(SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation.class);
String newLine = System.getProperty("line.separator");
//convert the output format to a string
String graph = "\n\nDependency Graph: "
+ dependency_graph.toString(SemanticGraph.OutputFormat.XML)//save the answer like a String from the xml
+ newLine;
// System.out.println("The graph was made=>" + graph);
return graph;
}
Answer 0 (score: 0)
In dependencyGraph(String) you do
String graph = "\n\nDependency Graph: "
+ dependency_graph.toString(SemanticGraph.OutputFormat.XML);
which creates a string that starts with two newlines and the text "Dependency Graph: ".
You then assign it to a variable:
String xmlString;
xmlString = Features.dependencyGraph(graph);
and then try to parse it as XML:
Document document = parseXmlFromString(xmlString);
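But a string that starts with two newlines and the text "Dependency Graph: " is not well-formed XML, so the XML parser complains: at line 3, column 1 it found content that is not allowed in the prolog of an XML document.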
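The failure can be reproduced without Stanford CoreNLP. The following is only a minimal sketch: the class name PrologErrorDemo and the helper failsToParse are made up for illustration, and the XML is a shortened stand-in for the real graph output.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class PrologErrorDemo {

    // Hypothetical helper: returns true if the text is rejected by the XML parser.
    static boolean failsToParse(String text) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new InputSource(new StringReader(text)));
            return false;
        } catch (SAXParseException e) {
            // Report where the parser gave up and why.
            System.out.println("line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<dependencies style=\"typed\"></dependencies>";
        // Same prefix that dependencyGraph(String) puts in front of the XML.
        String withPrefix = "\n\nDependency Graph: " + xml;
        System.out.println("prefixed string rejected: " + failsToParse(withPrefix));
        System.out.println("plain XML rejected: " + failsToParse(xml));
    }
}
```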
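So the answer to the title question is: if you want to parse a string as XML, it must contain well-formed XML. Here that means passing the parser only the XML itself, without the "Dependency Graph: " prefix.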
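The most direct fix is probably to have dependencyGraph(String) return only the XML and no label. If the label has to stay for other callers, one possible workaround is to cut off everything before the first '<' before parsing. The sketch below assumes that approach; the class name XmlPrologFix and the helper stripLeadingText are invented for illustration, and the input is a shortened copy of the XML from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class XmlPrologFix {

    // Hypothetical helper: drops any text that precedes the first '<'.
    static String stripLeadingText(String s) {
        int start = s.indexOf('<');
        if (start < 0) {
            throw new IllegalArgumentException("No XML element found in: " + s);
        }
        return s.substring(start).trim();
    }

    // Parses from a Reader, so no platform-dependent getBytes() conversion is involved.
    static Document parseXmlFromString(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String graph = "\n\nDependency Graph: "
                + "<dependencies style=\"typed\">"
                + "<dep type=\"dep\">"
                + "<governor idx=\"1\">Maria</governor>"
                + "<dependent idx=\"2\">mrge</dependent>"
                + "</dep>"
                + "</dependencies>\n";

        Document document = parseXmlFromString(stripLeadingText(graph));
        NodeList deps = document.getElementsByTagName("dep");
        System.out.println("dep elements found: " + deps.getLength());
    }
}
```

Stripping the prefix is the more fragile of the two options, because it silently accepts whatever comes before the XML; returning clean XML from the producer keeps the contract between the two methods explicit.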