Question

I'm trying to download a list of movie titles, dates, and lengths from this website: http://www.fancast.com/movies. My code is:

        // used to build entire input
        StringBuilder sb = new StringBuilder();

        // used on each read operation
        byte[] buf = new byte[8192];

        // prepare the web page we will be asking for
        HttpWebRequest request = (HttpWebRequest)
            WebRequest.Create("http://www.fancast.com/movies");

        // execute the request
        HttpWebResponse response = (HttpWebResponse)
            request.GetResponse();

        // we will read data via the response stream
        Stream resStream = response.GetResponseStream();

        string tempString = null;
        int count = 0;

        do
        {
            // fill the buffer with data
            count = resStream.Read(buf, 0, buf.Length);

            // make sure we read some data
            if (count != 0)
            {
                // translate from bytes to ASCII text
                tempString = Encoding.ASCII.GetString(buf, 0, count);

                // continue building the string
                sb.Append(tempString);
            }
        }
        while (count > 0); // any more data to read?

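I borrowed this from some sample code I found online. However, when I look at what it downloads, it doesn't contain the information I'm looking for. It has the same content as the site's "View Source". The page seems to be calling another URL that holds the information, but I can't seem to find or access it. Any help on how to get a list of movie titles, lengths, and/or dates would be greatly appreciated. Thanks!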

Answer 1

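Exactly. If you analyze the source of that web page, you'll see that the movies are loaded from a different URL. Use the Google Chrome developer tools (or any other tool, such as Fiddler2, which I really recommend) to trace all the resources the browser downloads while displaying the page.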

I did that, and it looks like the movie database is fetched from http://www.fancast.com/movies_free_db.widget

So, change your WebRequest to point to that URL.
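As a rough sketch of that change, here is the question's download logic pointed at the widget URL. It keeps the same HttpWebRequest API but reads the response with a StreamReader instead of a manual buffer loop. The class and method names (MovieListDownloader, ReadAll, Download) are made up for illustration, and UTF-8 is only an assumption about the response encoding; I haven't checked what format the widget actually returns, so you'll still need to inspect the result and parse it yourself.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class; the names are illustrative only.
public static class MovieListDownloader
{
    // Reads an entire stream as text using the given encoding.
    public static string ReadAll(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Downloads the body of the given URL as a string.
    // Assumption: the response is UTF-8 text; check the Content-Type
    // header of the real response before relying on this.
    public static string Download(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        {
            return ReadAll(stream, Encoding.UTF8);
        }
    }
}
```

You would call it as MovieListDownloader.Download("http://www.fancast.com/movies_free_db.widget") and then look at the returned string to see how the titles, dates, and lengths are laid out.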

Answer 2

Well... you've opened a whole can of worms there.

Your comment, "...has the same information as View Source...", makes me think you don't fully understand the details of what's going on.

I'd recommend HTTP Programming Recipes for C#. It's the book I read when I wrote my first web spider, and I think it will give you a good push in the right direction.

Trying to download a list of movies
