我正在尝试在C#应用程序中进行一些抓取。
我正尝试在下一页上访问4条信息: https://smstestbed.nist.gov/vds/current
以下功能是我从远程加工工具轮询实时数据源的地方。 我的问题是,尽管我已经能够在终端上打印“ CreationTime”,但我的XPath使用却笨拙,据This Link似乎暗示我应该能够在我的评论后有2行
“ //这应该是访问数据的一种更好的方法,但是由于某些原因第二行失败”
不幸的是,我得到的 AvailabilityNode 为空。
public static void PollNIST()
{
string NISTSourceURL = "https://smstestbed.nist.gov/vds/current"; // Gives us a human friendly reference to the HTM
//-------------------------------- Current (mostly) Working Version---------------------------------------------------------------------------------
// Retrieve raw HTML
var NISTTargetURL = NISTSourceURL;
var NISTHttpClient = new HttpClient();
var NISTXMLRaw = NISTHttpClient.GetStringAsync(NISTTargetURL); // We now have all of the HTML / XML Data as a raw string
//Console.WriteLine(MazXMLRaw.Result); // Prints the resulting HTML to a terminal as a debug tool (Works)
XmlDocument CurNISTXML = new XmlDocument(); // Generate Blank XML Doc
CurNISTXML.LoadXml(NISTXMLRaw.Result); // This (".result") passes the actual string?, should then be loaded into new XML file
var elementHeader = CurNISTXML.GetElementsByTagName("Header");
var curNISTHeader = elementHeader.Item(0);
var creationTime = curNISTHeader.Attributes[0]; // We actually have the creationTime
string CurNISTTime = creationTime.InnerText; ; // //*[@id="mtconnect content"]/ul/li[1]
//This should be a far better way of accessing the data but for some reason the second line fails
XmlNode AvailabilityNode = CurNISTXML.SelectSingleNode("/table[1]/tbody/tr[1]"); //*[@id="mtconnect content"]/table[1]/tbody/tr[1]/td[7] // Xpath Availability
var CurNISTStatus = AvailabilityNode.InnerText; // //*[@id="mtconnect content"]/ul/li[1]
string CurNistX = ""; // //*[@id="mtconnect content"]/table[5]/tbody/tr/td[7]
string CurNistY = ""; // //*[@id="mtconnect content"]/table[6]/tbody/tr/td[7]
Console.WriteLine("-------BEGIN NIST DATA PACKET-------");
Console.WriteLine("NIST Time : " + creationTime.InnerText);
Console.WriteLine("NIST Status: " + CurNISTStatus);
Console.WriteLine("NIST X Pos.: " + CurNistX);
Console.WriteLine("NIST Y Pos.: " + CurNistY);
Console.WriteLine("--------END NIST DATA PACKET--------");
//var currentNIST = new NISTDataSet()// Create new instance ofNISTdata object
}
有什么想法吗?
答案 0 :(得分:1)
XPath表达式
/table[1]/tbody/tr[1]
table
元素时,才会成功,这似乎不太可能。我没有尝试了解页面或代码的逻辑,但这显然是错误的。路径表达式开头的“ /”从树的根中选择。
答案 1 :(得分:0)
因此,事实证明,我提取XML的方式并没有问题,仅使用Paths。
public static void PollNIST()
{
string NISTSourceURL = "https://smstestbed.nist.gov/vds/current"; // Gives us a human friendly reference to the HTMl
// string NistXmlUrl = // Someone on stackexchange is claiming that there is another url for the XML but viewsource says otherwise
//-------------------------------- Current (mostly) Working Version---------------------------------------------------------------------------------
var NISTHttpClient = new HttpClient();
var NISTXMLRaw = NISTHttpClient.GetStringAsync(NISTSourceURL); // We now have all of the HTML / XML Data as a raw string
//Console.WriteLine(MazXMLRaw.Result); // Prints the resulting HTML to a terminal as a debug tool (Works)
XmlDocument CurNISTXML = new XmlDocument(); // Generate Blank XML Doc
CurNISTXML.LoadXml(NISTXMLRaw.Result); // This (".result") passes the actual string?, should then be loaded into new XML file
// Get CreationTime (WORKING!)
XmlNodeList elementHeader = CurNISTXML.GetElementsByTagName("Header");
XmlNode curNISTHeader = elementHeader.Item(0);
XmlAttribute creationTime = curNISTHeader.Attributes[0]; // We now have the creationTime element
string CurNISTTime = creationTime.InnerText; // //*[@id="mtconnect content"]/ul/li[1]
// Get availability (WORKING!)
XmlNodeList nodeAvailability = CurNISTXML.GetElementsByTagName("Availability");
XmlNode availability = nodeAvailability.Item(0); // I think this is maybe a bit of a hackish / improper way to do this?
string curNISTStatus = availability.InnerText;
//Get linear tool X Coord.
XmlNodeList deviceStream = CurNISTXML.GetElementsByTagName("ComponentStream");
XmlNode linearCompXStream = deviceStream.Item(4);
string curNISTX = linearCompXStream.InnerText; // We do not need to break down the nodes any further as the value is the only text within
//Get Linear tool y Coord.
XmlNode linearCompYStream = deviceStream.Item(5);
string curNISTY = linearCompYStream.InnerText; // We do not need to break down the nodes any further as the value is the only text within
Console.WriteLine("-------BEGIN NIST DATA PACKET-------");
Console.WriteLine("NIST Time : " + creationTime.InnerText);
Console.WriteLine("NIST Status: " + curNISTStatus);
Console.WriteLine("NIST X Pos.: " + curNISTX);
Console.WriteLine("NIST Y Pos.: " + curNISTY);
Console.WriteLine("--------END NIST DATA PACKET--------");
//var currentNIST = new NISTDataSet()// Create new instance ofNISTdata object
}
运作良好。