How to update a table using query parameters with Html Agility Pack in C#

Time: 2018-03-09 07:37:01

Tags: html html-parsing html-agility-pack

I parsed this URL with Html Agility Pack:

http://www.cmegroup.com/trading/energy/crude-oil/light-sweet-crude_quotes_settlements_options.html

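The table shown by default is always the one for the nearest contract date and the current date.

I have no problem parsing the full page above, but when I add a query parameter to request another date, I can't seem to get the new table: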

E.g. http://www.cmegroup.com/trading/energy/crude-oil/light-sweet-crude_quotes_settlements_options.html?tradeDate=03/07/2018

This still returns the table for the current date, i.e. 03/08/2018.

However, if I add another query parameter for the contract month, it does work:

E.g. http://www.cmegroup.com/trading/energy/crude-oil/light-sweet-crude_quotes_settlements_options.html?optionExpiration=190-M18&tradeDate=03/07/2018

But if I then query again:

E.g. http://www.cmegroup.com/trading/energy/crude-oil/light-sweet-crude_quotes_settlements_options.html?optionExpiration=190-M18&tradeDate=03/06/2018

...it won't give me the table for 03/06/2018.

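It only seems to update the HTML for me when I change two or more query parameters in the URL. I'm very much a noob with HTML, so I'm not sure whether this has something to do with the actual website 'blocking' my request, or whether it expects some 'user interaction'?

The very basic core of my code is: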

    using HtmlAgilityPack;

    HtmlDocument htmlDoc = new HtmlDocument { OptionFixNestedTags = true };         
    HtmlWeb web = new HtmlWeb();           
    htmlDoc = web.Load(url);

A step in the right direction would be great.

Thanks.

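2 Answers:

Answer 0: (score: 0)

It's an Ajax site. The web page contains JavaScript that issues an Ajax query whenever the filter changes. So you don't need html-agility-pack; you need JSON.NET.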

The URL is: http://www.cmegroup.com/CmeWS/mvc/Settlements/Options/Settlements//190/OOF?monthYear=LOM18&strategy=DEFAULT&tradeDate=03/08/2018&pageSize=50&_=1520581128693

You need to build the URL query string, download the text with WebClient.DownloadString, and use JSON.NET to convert it into a POCO.
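
As a rough sketch of the "build the URL query string" step, something like the following could work. The class and method names (`SettlementUrl`, `Build`) are made up for illustration, the endpoint and parameter names are simply copied from the URL quoted above, and treating `_` as a cache-busting timestamp in Unix milliseconds is an assumption rather than something the site documents.

```csharp
using System;
using System.Globalization;

public static class SettlementUrl
{
    // Endpoint copied verbatim from the URL in the answer ("190" and "OOF" taken as-is).
    private const string Endpoint =
        "http://www.cmegroup.com/CmeWS/mvc/Settlements/Options/Settlements//190/OOF";

    // Builds the request URL for one contract month (e.g. "LOM18") and one trade date.
    public static string Build(string monthYear, DateTime tradeDate, int pageSize = 50)
    {
        // InvariantCulture keeps '/' as the date separator regardless of machine locale.
        string date = tradeDate.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture);

        // Assumption: "_" is only a cache-busting timestamp in Unix milliseconds.
        long cacheBuster = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();

        return Endpoint
            + "?monthYear=" + Uri.EscapeDataString(monthYear)
            + "&strategy=DEFAULT"
            + "&tradeDate=" + date
            + "&pageSize=" + pageSize.ToString(CultureInfo.InvariantCulture)
            + "&_=" + cacheBuster.ToString(CultureInfo.InvariantCulture);
    }
}
```

Calling `SettlementUrl.Build("LOM18", new DateTime(2018, 3, 6))` gives a URL with `tradeDate=03/06/2018`, which can then be passed to `WebClient.DownloadString`. Whether the server accepts every month and date combination is something you would have to check against the live site.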

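Answer 1: (score: 0)

OK, so I'm posting this answer for anyone else who reads this. Feel free to comment or edit the post. Thanks again to Dovid for the suggestion. I can't guarantee the syntax is perfectly valid, but it's very close. The code loads the web page's table as JSON and then saves it to a file. There is also a method for loading it back from the JSON file. The code 'as is' isn't meant to work by copy and paste; it's just a reference for how I did it.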

using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

private string _jsonStr;
private string _tableUrlStr = "http://www.cmegroup.com/CmeWS/mvc/Settlements/Options/Settlements//190/OOF?monthYear=LOM18&strategy=DEFAULT&tradeDate=03/08/2018&pageSize=50&_=1520581128693";

using (WebClient wc = new WebClient())
{
    wc.BaseAddress = @"http://www.cmegroup.com/";
    wc.Headers[HttpRequestHeader.ContentType] = "application/json";
    wc.Headers[HttpRequestHeader.Accept] = "application/json";

    _jsonStr = wc.DownloadString(_tableUrlStr);
}

if (string.IsNullOrEmpty(_jsonStr))
    return;

JObject jo = JObject.Parse(_jsonStr);

//## Add some more detail to the Json file.
jo.Add("instrumentName", "my instrument name");
jo.Add("contract", "my contract name");

//## For easier debugging but larger file size.
_jsonStr = jo.ToString(Formatting.Indented);


//## Json to file:
string path = directoryString + fileString + ".json";

if (!Directory.Exists(directoryString))
{
    Directory.CreateDirectory(directoryString);
}

if (File.Exists(path))
{        
    return;
}

using (FileStream fileStream = new FileStream(path, FileMode.CreateNew, FileAccess.Write))
{
    // Disposing the writer flushes and closes it, so no explicit Close() is needed.
    using (var streamWriter = new StreamWriter(fileStream, Encoding.UTF8))
    {
        streamWriter.WriteLine(_jsonStr);
    }
}

//## Json file to collection:
//## Can copy and paste your Json at 'www.json2csharp.com'.

public class Settlement
{
    public string strike { get; set; }
    public string type { get; set; }
    public string open { get; set; }
    public string high { get; set; }
    public string low { get; set; }
    public string last { get; set; }
    public string change { get; set; }
    public string settle { get; set; }
    public string volume { get; set; }
    public string openInterest { get; set; }
}

public class RootObject
{
    public List<Settlement> settlements { get; set; }
    public string updateTime { get; set; }
    public string dsHeader { get; set; }
    public string reportType { get; set; }
    public string tradeDate { get; set; }
    public bool empty { get; set; }
    //## Newly added entries
    public string instrumentName { get; set; }
    public string contract { get; set; }
}

private static IEnumerable<Settlement> JsonFileToList(string directoryString, string fileString)
{
    if (directoryString == null)
    {
        return null;
    }

    string path = directoryString + fileString + ".json";

    if (!Directory.Exists(directoryString))
    {
        Directory.CreateDirectory(directoryString);
    }

    if (!File.Exists(path))
    {
        return null;
    }

    RootObject ro = JsonConvert.DeserializeObject<RootObject>(File.ReadAllText(path));

    var settlementList = ro.settlements;

    foreach (var settlement in settlementList)
    {
        //## Do something with this data.
        Console.WriteLine(String.Format("Strike: {0}, Volume: {1}, Last: {2}", settlement.strike, settlement.volume, settlement.last));
    }

    return settlementList;

}
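
To check the JSON-to-POCO mapping without hitting the network or the file system, a small sketch like the one below may help. It deserializes an in-memory string with `JsonConvert.DeserializeObject<T>`. The class names (`SettlementRow`, `SettlementPage`, `SettlementParser`) are hypothetical trimmed-down stand-ins for the classes above, and the sample payload is invented purely for illustration; it is not real CME data.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

// Trimmed-down, hypothetical versions of the Settlement / RootObject classes above.
public class SettlementRow
{
    public string strike { get; set; }
    public string type { get; set; }
    public string settle { get; set; }
    public string volume { get; set; }
}

public class SettlementPage
{
    public List<SettlementRow> settlements { get; set; }
    public string tradeDate { get; set; }
    public bool empty { get; set; }
}

public static class SettlementParser
{
    // Invented payload with the same shape as the classes; values are placeholders.
    public const string SampleJson =
        "{\"settlements\":[{\"strike\":\"6000\",\"type\":\"Call\",\"settle\":\"1.23\",\"volume\":\"10\"}]," +
        "\"tradeDate\":\"03/06/2018\",\"empty\":false}";

    // Deserializes the JSON text; never returns null and never leaves settlements null.
    public static SettlementPage Parse(string json)
    {
        SettlementPage page = string.IsNullOrEmpty(json)
            ? null
            : JsonConvert.DeserializeObject<SettlementPage>(json);

        if (page == null)
        {
            page = new SettlementPage { empty = true };
        }
        if (page.settlements == null)
        {
            page.settlements = new List<SettlementRow>();
        }
        return page;
    }
}
```

`SettlementParser.Parse(SettlementParser.SampleJson)` returns one row with strike "6000". Properties missing from the JSON are left at their defaults, so the trimmed classes still load a fuller payload.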