monitorUrl is http://host03:8810/solr/admin/stats.jsp, which returns the following XML:
<?xml-stylesheet type="text/xsl" href="stats.xsl"?>
<solr>
<core></core>
<schema>test</schema>
<host>domain.host.com</host>
<now>Fri Nov 11 11:14:01 PST 2011</now>
<start>Thu Sep 22 18:33:06 PDT 2011</start>
<solr-info>
<CORE>
<entry>
<name>
core
</name>
<class>
</class>
<version>
1.0
</version>
<description>
SolrCore
</description>
<stats>
<stat name="coreName" >
</stat>
<stat name="startTime" >
Thu Sep 22 18:33:06 PDT 2011
</stat>
<stat name="refCount" >
2
</stat>
<stat name="aliases" >
[]
</stat>
</stats>
</entry>
<entry>
<name>
searcher
</name>
<class>
org.apache.solr.search.SolrIndexSearcher
</class>
<version>
1.0
</version>
<description>
index searcher
</description>
<stats>
<stat name="searcherName" >
Searcher@5b637a2d main
</stat>
<stat name="caching" >
true
</stat>
<stat name="numDocs" >
111959
</stat>
<stat name="maxDoc" >
112310
</stat>
<stat name="reader" >
DirectoryReader(segments_h0 _1zn:Cv101710/351 _1zl:Cv8026 _1zp:Cv2574)
</stat>
<stat name="readerDir" >
org.apache.lucene.store.NIOFSDirectory@/es_idx_prd/projects/index/solr-agile/document/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@2c164804
</stat>
<stat name="indexVersion" >
1313979005459
</stat>
<stat name="openedAt" >
Fri Nov 11 11:00:04 PST 2011
</stat>
<stat name="registeredAt" >
Fri Nov 11 11:00:04 PST 2011
</stat>
<stat name="warmupTime" >
0
</stat>
</stats>
</entry>
</solr-info>
</solr>
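I want to extract the numDocs value (111959) from the XML above. The fetchlog method below simply reads the JSP output line by line. How can I pick out the numDocs value while reading it line by line?
monitorUrl points to a .jsp page that returns XML.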
public void fetchlog() {
    InputStream is = null;
    FileOutputStream fos = null;
    try {
        is = HttpUtil.getFile(monitorUrl);
        BufferedReader in =
            new BufferedReader(new InputStreamReader(is));
        String line;
        while ((line = in.readLine()) != null) {
            if (line.contains("numDocs")) {
                // Extract numDocs value - how to do this?
            }
            System.out.println(line);
        }
        fos = new FileOutputStream(buildTargetPath());
        IOUtils.copy(is, fos);
    } catch (FileNotFoundException e) {
        log.error("File Exception in fetching monitor logs :" + e);
    } catch (IOException e) {
        log.error("Exception in fetching monitor logs :" + e);
    }
}
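Answer 0 (score: 1)
You can use Dom4J (or any XML parser) together with XPath. The example below uses the JAXP API that ships with the JDK: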
import java.io.IOException;
import org.w3c.dom.*;
import org.xml.sax.SAXException;
import javax.xml.parsers.*;
import javax.xml.xpath.*;

public class XPathExample {
    public static void main(String[] args)
            throws ParserConfigurationException, SAXException,
                   IOException, XPathExpressionException {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true); // never forget this!
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        // parse(String) takes a URI, so the monitor URL can be passed directly.
        Document doc = builder.parse("http://host03:8810/solr/admin/stats.jsp");
        XPathFactory factory = XPathFactory.newInstance();
        XPath xpath = factory.newXPath();
        // numDocs is the value of the name attribute on a <stat> element,
        // not an element name, so select it with an attribute predicate.
        XPathExpression expr = xpath.compile("//stat[@name='numDocs']");
        Object result = expr.evaluate(doc, XPathConstants.NODESET);
        NodeList nodes = (NodeList) result;
        for (int i = 0; i < nodes.getLength(); i++) {
            // getNodeValue() returns null for element nodes; read the text instead.
            System.out.println(nodes.item(i).getTextContent().trim());
        }
    }
}
http://www.ibm.com/developerworks/library/x-javaxpathapi/index.html
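If the XML is already available as an InputStream (as it is in the question's fetchlog method), the same lookup can be written more compactly. In the XML from the question, numDocs is the value of a name attribute on a <stat> element, so an attribute predicate is needed to select it. The following is only a minimal sketch: the class name NumDocsXPath, the helper extractNumDocs, and the inline sample XML are illustrative assumptions, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

final class NumDocsXPath {

    // Hypothetical helper: parses the stats XML from a stream and returns numDocs.
    static long extractNumDocs(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(xml);
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Without an explicit return type, evaluate() yields the string value
        // of the first node that matches the expression.
        String text = xpath.evaluate("//stat[@name='numDocs']", doc);
        // The value sits on its own line in the Solr output, so trim whitespace.
        return Long.parseLong(text.trim());
    }

    public static void main(String[] args) throws Exception {
        // Cut-down stand-in for the stats.jsp response (assumed sample data).
        String sample =
                "<solr><solr-info><CORE><entry><stats>\n"
              + "<stat name=\"numDocs\" >\n111959\n</stat>\n"
              + "<stat name=\"maxDoc\" >\n112310\n</stat>\n"
              + "</stats></entry></CORE></solr-info></solr>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(extractNumDocs(in));
    }
}
```

In fetchlog, the stream returned by HttpUtil.getFile(monitorUrl) could presumably be passed to such a helper in place of the sample stream.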
Answer 1 (score: 1)
To follow up on my comment: if the XML/text structure is always the same, you can do this:
if (line.contains("numDocs")) {
    // The value is on the line that follows the <stat name="numDocs"> tag.
    String numDocs = in.readLine();
    if (numDocs != null) {
        System.out.println("Num docs:" + numDocs.trim());
    }
}
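For reference, the line-by-line idea can be tried in isolation with a small standalone sketch. It assumes, as in the sample output, that the value sits on its own line after the <stat name="numDocs"> tag; if Solr ever printed the value on the same line as the tag, this would fail with a NumberFormatException. The class name NumDocsLineScanner and the helper findNumDocs are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class NumDocsLineScanner {

    // Hypothetical helper: returns the numDocs value, or -1 if the tag is not found.
    static long findNumDocs(Reader source) throws IOException {
        BufferedReader in = new BufferedReader(source);
        String line;
        while ((line = in.readLine()) != null) {
            // Match the attribute exactly so unrelated lines mentioning numDocs are ignored.
            if (line.contains("name=\"numDocs\"")) {
                String value;
                // Skip blank lines, then parse the first non-empty line as the value.
                while ((value = in.readLine()) != null) {
                    value = value.trim();
                    if (!value.isEmpty()) {
                        return Long.parseLong(value);
                    }
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample mirroring the layout of the stats.jsp output.
        String sample = "<stat name=\"numDocs\" >\n111959\n</stat>\n"
                      + "<stat name=\"maxDoc\" >\n112310\n</stat>\n";
        System.out.println(findNumDocs(new StringReader(sample)));
    }
}
```

One thing to watch in the question's fetchlog: once the BufferedReader has read the stream to the end, the later IOUtils.copy(is, fos) has nothing left to copy, so the target file will most likely come out empty unless the lines are written out inside the loop or the URL is fetched a second time.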