I have some XML like the following.
Example 1:
<explanation>
<NodeExplanations>
<IDAfterSkipProcessing>/Return/ReturnData/PPStudentLoanInterestWks/StudentLoanInterestDeductionAmtPP</IDAfterSkipProcessing>
<NodeExplanation>
<ID>/Return/ReturnData/PPStudentLoanInterestWks/StudentLoanInterestDeductionAmtPP</ID>
<SkippedToIDForExplanationData>/Return/ReturnData/PPStudentLoanInterestWks/StudentLoanInterestDeductionAmtPP</SkippedToIDForExplanationData>
<Value>2000</Value>
<Gist>Difference</Gist>
<Scenario>DIFFERENCE</Scenario>
<Title>StudentLoanInterestDeductionAmtPP</Title>
<Phrase>
<Text>StudentLoanInterestDeductionAmtPP</Text>
</Phrase>
<Question>
<Text>StudentLoanInterestDeductionAmtPP</Text>
</Question>
<ExplanationText>
<NodeName>
<Text>StudentLoanInterestDeductionAmtPP</Text>
</NodeName>
<Text> comes from subtracting </Text>
<InputName>
<Text>MultiplyLine2byLine6AmtPP</Text>
</InputName>
<Text> from </Text>
<InputName>
<Text>SmallerOfLine1OrLimitAmtPP</Text>
</InputName>
<Text>.</Text>
<BulletedList>
<ListEntry>
<InputLink>
<Ref>/Return/ReturnData/PPStudentLoanInterestWks/SmallerOfLine1OrLimitAmtPP</Ref>
<LinkText>
<Text>SmallerOfLine1OrLimitAmtPP</Text>
</LinkText>
</InputLink>
</ListEntry>
<ListEntry>
<InputLink>
<Ref>/Return/ReturnData/PPStudentLoanInterestWks/MultiplyLine2byLine6AmtPP</Ref>
<LinkText>
<Text>MultiplyLine2byLine6AmtPP</Text>
</LinkText>
</InputLink>
</ListEntry>
</BulletedList>
</ExplanationText>
<InputNodes>
<InputNodeEntry>
<ID>/Return/ReturnData/PPStudentLoanInterestWks/QualifiedStudentLoanPP</ID>
<Role>BlankIfFalse</Role>
<Value>true</Value>
<Type>CALCULATED_NODE</Type>
<HasSubExplanations>false</HasSubExplanations>
</InputNodeEntry>
<InputNodeEntry>
<ID>/Return/ReturnData/PPStudentLoanInterestWks/MultiplyLine2byLine6AmtPP</ID>
<Role>Right</Role>
<Value>0</Value>
<Type>CALCULATED_NODE</Type>
<HasSubExplanations>false</HasSubExplanations>
</InputNodeEntry>
<InputNodeEntry>
<ID>/Return/ReturnData/PPStudentLoanInterestWks/SmallerOfLine1OrLimitAmtPP</ID>
<Role>Left</Role>
<Value>2000</Value>
<Type>CALCULATED_NODE</Type>
<HasSubExplanations>false</HasSubExplanations>
</InputNodeEntry>
</InputNodes>
<Children>
<ID>/Return/ReturnData/PPStudentLoanInterestWks/QualifiedStudentLoanPP</ID>
<ID>/Return/ReturnData/PPStudentLoanInterestWks/MultiplyLine2byLine6AmtPP</ID>
<ID>/Return/ReturnData/PPStudentLoanInterestWks/SmallerOfLine1OrLimitAmtPP</ID>
</Children>
</NodeExplanation>
</NodeExplanations>
</explanation>
Example 2:
<ExplanationText>
<Text>We can't get any more details on </Text>
<NodeName>
<Text>QualifiedStudentLoansInterestAmtPP</Text>
</NodeName>
<Text> right now.</Text>
</ExplanationText>
Example 3:
<ExplanationText>
<Text>We can't get any more details on </Text>
<NodeValue>
<Value>123</Value>
</NodeValue>
<Text> right now.</Text>
</ExplanationText>
Example 4:
<ExplanationText>
<NodeName>
<Text>Your Earned Income Credit of </Text>
<NodeValue>
<Currency>156</Currency>
</NodeValue>
</NodeName>
<Text> comes from </Text>
<InputValue>
<Currency>156</Currency>
</InputValue>
<Text>.</Text>
</ExplanationText>
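I want to get the text from every Text and Value tag in these XML snippets, but ignore everything under a BulletedList tag. I want everything else. How can I achieve this in Java?
Here is my current implementation: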
public static String getExplanationTextFromResponse(String pathToExpFile, String nodeID) {
File f = new File(pathToExpFile);
String text = new String();
Map<String,String> listOfExpText = getExplanationText(f);
text = listOfExpText.get(nodeID);
return text;
}
public static Map<String,String> getExplanationText(File pathToExpFile) {
Map<String,String> map = new LinkedHashMap<String, String>();
String nodeExplanationXPATH = "/explanation/NodeExplanations/NodeExplanation";
String explanationTextXPATH = "ExplanationText//Text/text()";
String id = new String();
String text = new String();
try {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
String xml = FileUtils.readFileToString(pathToExpFile);
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(new InputSource(new StringReader(xml)));
NodeList nlNodeExplanationList = doc.getElementsByTagName("NodeExplanation");
for(int i=0;i<nlNodeExplanationList.getLength();i++) {
Node explanationNode = nlNodeExplanationList.item(i);
List<String> idList = getTextValuesByTagName((Element)explanationNode, "ID");
id = idList.get(0);
}
XPath xpath = XPathFactory.newInstance().newXPath();
Object object = new Object();
object = xpath.evaluate(nodeExplanationXPATH, doc, XPathConstants.NODESET);
NodeList explanations = (NodeList) object;
int count = explanations.getLength();
for(int i=0;i<count;i++) {
Object obj = new Object();
Node explanation = explanations.item(i);
obj = xpath.evaluate(explanationTextXPATH, explanation, XPathConstants.NODESET);
NodeList explanationText = (NodeList) obj;
text = (joinNodeSetText(explanationText));
logger.debug(joinNodeSetText(explanationText));
}
map.put(id, text);
return map;
}
catch (IOException e) {
e.printStackTrace();
} catch (ParserConfigurationException e) {
e.printStackTrace();
} catch (XPathExpressionException e) {
e.printStackTrace();
} catch (SAXException e) {
e.printStackTrace();
}
return null;
}
/**
* Joins the .getTextContent() of each node in the nodeSet concatenated into one string.
*
* @param nodeSet
* @return
*/
static String joinNodeSetText(NodeList nodeSet) {
StringBuilder builder = new StringBuilder();
logger.debug("Combining Text nodes...");
for(int i=0;i<nodeSet.getLength();i++) {
builder.append(nodeSet.item(i).getTextContent());
}
logger.debug("Combination complete!");
return builder.toString();
}
Edit
When I use '//Text[not(ancestor::BulletedList)] | //Value[not(ancestor::BulletedList)]', I get the following:
javax.xml.transform.TransformerException: Expected ], but found:
at org.apache.xpath.compiler.XPathParser.error(XPathParser.java:610)
at org.apache.xpath.compiler.XPathParser.consumeExpected(XPathParser.java:528)
at org.apache.xpath.compiler.XPathParser.Predicate(XPathParser.java:1937)
at org.apache.xpath.compiler.XPathParser.Step(XPathParser.java:1726)
at org.apache.xpath.compiler.XPathParser.RelativeLocationPath(XPathParser.java:1626)
at org.apache.xpath.compiler.XPathParser.LocationPath(XPathParser.java:1597)
at org.apache.xpath.compiler.XPathParser.PathExpr(XPathParser.java:1317)
at org.apache.xpath.compiler.XPathParser.UnionExpr(XPathParser.java:1236)
at org.apache.xpath.compiler.XPathParser.UnaryExpr(XPathParser.java:1142)
at org.apache.xpath.compiler.XPathParser.MultiplicativeExpr(XPathParser.java:1063)
at org.apache.xpath.compiler.XPathParser.AdditiveExpr(XPathParser.java:1005)
at org.apache.xpath.compiler.XPathParser.RelationalExpr(XPathParser.java:930)
at org.apache.xpath.compiler.XPathParser.EqualityExpr(XPathParser.java:870)
at org.apache.xpath.compiler.XPathParser.AndExpr(XPathParser.java:834)
at org.apache.xpath.compiler.XPathParser.OrExpr(XPathParser.java:807)
at org.apache.xpath.compiler.XPathParser.Expr(XPathParser.java:790)
at org.apache.xpath.compiler.XPathParser.initXPath(XPathParser.java:129)
at org.apache.xpath.XPath.<init>(XPath.java:178)
at org.apache.xpath.XPath.<init>(XPath.java:266)
at org.apache.xpath.jaxp.XPathImpl.eval(XPathImpl.java:195)
at org.apache.xpath.jaxp.XPathImpl.evaluate(XPathImpl.java:281)
at com.generalatomics.ctg.engine.automation.tools.contentutils.calc.ExplanationResponseReader.getExplanationText(ExplanationResponseReader.java:398)
at com.generalatomics.ctg.engine.automation.tools.contentutils.calc.ExplanationResponseReader.getExplanationTextFromResponse(ExplanationResponseReader.java:349)
at com.generalatomics.ctg.engine.automation.tools.contentutils.calc.ExplanationResponseReader.t(ExplanationResponseReader.java:338)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
at org.testng.TestRunner.privateRun(TestRunner.java:767)
at org.testng.TestRunner.run(TestRunner.java:617)
at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:329)
at org.testng.SuiteRunner.privateRun(SuiteRunner.java:291)
at org.testng.SuiteRunner.run(SuiteRunner.java:240)
at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
at org.testng.TestNG.runSuitesSequentially(TestNG.java:1224)
at org.testng.TestNG.runSuitesLocally(TestNG.java:1149)
at org.testng.TestNG.run(TestNG.java:1057)
at org.testng.remote.RemoteTestNG.run(RemoteTestNG.java:111)
at org.testng.remote.RemoteTestNG.initAndRun(RemoteTestNG.java:204)
at org.testng.remote.RemoteTestNG.main(RemoteTestNG.java:175)
Answer 0 (score: 3)
If you use the XPath expression //Text[not(ancestor::BulletedList)] | //Value[not(ancestor::BulletedList)], you select all Text and Value elements that are not inside a BulletedList element.
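As a rough illustration of the expression above, here is a minimal sketch using the standard JAXP XPath API. The class name ExplanationTextDemo, the method extractText, and the inline sample XML are made up for this example; only the XPath expression comes from the answer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original code base.
public class ExplanationTextDemo {

    // Text and Value elements anywhere in the document, except under BulletedList.
    static final String EXPR =
            "//Text[not(ancestor::BulletedList)] | //Value[not(ancestor::BulletedList)]";

    static String extractText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(EXPR, doc, XPathConstants.NODESET);

        // Concatenate the text content of every selected element.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < nodes.getLength(); i++) {
            sb.append(nodes.item(i).getTextContent());
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample input modeled on Example 3, with a BulletedList added (assumption).
        String xml = "<ExplanationText>"
                + "<Text>We can't get any more details on </Text>"
                + "<NodeValue><Value>123</Value></NodeValue>"
                + "<Text> right now.</Text>"
                + "<BulletedList><ListEntry><Text>ignored</Text></ListEntry></BulletedList>"
                + "</ExplanationText>";
        System.out.println(extractText(xml));
    }
}
```

Note that each predicate closes with its own `]` before the `|` union operator; a `|` placed inside the brackets would be parsed as part of the predicate instead.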
If you only want to search inside ExplanationText elements, use //ExplanationText//Text[not(ancestor::BulletedList)] | //ExplanationText//Value[not(ancestor::BulletedList)].
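To tie this back to the map built in the question, one possible approach is to evaluate a relative form of the expression (no leading //) against each NodeExplanation node, and to call map.put inside the loop so every ID keeps its own text. This is only a sketch under those assumptions; ExplanationMapDemo and explanationTextById are hypothetical names, and the XML is taken as a string rather than a file for brevity.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative only.
public class ExplanationMapDemo {

    // Relative to a NodeExplanation context node (assumed variant of the answer's expression).
    static final String RELATIVE_EXPR =
            "ExplanationText//Text[not(ancestor::BulletedList)]"
            + " | ExplanationText//Value[not(ancestor::BulletedList)]";

    static Map<String, String> explanationTextById(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList explanations = (NodeList) xpath.evaluate(
                "/explanation/NodeExplanations/NodeExplanation", doc, XPathConstants.NODESET);

        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < explanations.getLength(); i++) {
            Node explanation = explanations.item(i);
            // String value of the first direct ID child of this NodeExplanation.
            String id = xpath.evaluate("ID", explanation);
            NodeList parts = (NodeList) xpath.evaluate(
                    RELATIVE_EXPR, explanation, XPathConstants.NODESET);
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < parts.getLength(); j++) {
                sb.append(parts.item(j).getTextContent());
            }
            // Store inside the loop so each explanation gets its own entry.
            result.put(id, sb.toString());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<explanation><NodeExplanations><NodeExplanation>"
                + "<ID>/Return/Sample</ID>"
                + "<ExplanationText><Text>Total is </Text>"
                + "<NodeValue><Value>123</Value></NodeValue>"
                + "<BulletedList><ListEntry><Text>ignored</Text></ListEntry></BulletedList>"
                + "</ExplanationText>"
                + "</NodeExplanation></NodeExplanations></explanation>";
        System.out.println(explanationTextById(xml));
    }
}
```

Because the expression is relative, a Value element that sits directly under NodeExplanation (outside ExplanationText) is not picked up.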