Website login for data scraping

Date: 2013-10-14 18:18:33

Tags: c# web login

I am trying to scrape data from various remote transmitters over the web. For one brand of transmitter, I can log in using the following C# code:

public static string getSourceCode(string url, string user, string pass)
{
    SecureString pw = new SecureString();
    foreach (char c in pass.ToCharArray()) pw.AppendChar(c);
    NetworkCredential credential = new NetworkCredential(user, pw, url);
    CredentialCache cache = new CredentialCache();
    cache.Add(new Uri(url), "Basic", credential);
    Uri realLink = new Uri(url);
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(realLink);
    req.Credentials = CredentialCache.DefaultNetworkCredentials;

    HttpWebResponse resp = (HttpWebResponse)req.GetResponse();

    StreamReader sr = new StreamReader(resp.GetResponseStream());
    string sourceCode = sr.ReadToEnd();
    sr.Close();
    resp.Close();
    return sourceCode;
}

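A second brand of transmitter (I would rather not post its public URL) does not return a web page asking for a username and password. Instead, it pops up a box asking for the username and password. Using the code above against it only returns an Unauthorized error.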

Fiddler shows that the following is sent when I log in to the site successfully:

GET http(colon slash slash)lasvegas3abn(*)dyndns(*)tv(PORT)125(slash)measurements(*)htm HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0; Touch)
Accept-Encoding: gzip, deflate
Host: lasvegas3abn.dyndns.tv:125
Authorization: Basic dXNlcjpsaW5lYXI=
Connection: Keep-Alive
DNT: 1
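
The `Authorization: Basic ...` line in the trace is the standard HTTP Basic scheme: the value after `Basic ` is the Base64 encoding of `username:password`. As a minimal sketch of how such a value is formed and decoded (the `BasicAuth` class and its method names are hypothetical helpers for illustration, not part of the original code):

```csharp
using System;
using System.Text;

public static class BasicAuth
{
    // Builds the value of an HTTP Basic Authorization header:
    // "Basic " followed by Base64("user:pass").
    public static string BuildHeaderValue(string user, string pass)
    {
        byte[] raw = Encoding.UTF8.GetBytes(user + ":" + pass);
        return "Basic " + Convert.ToBase64String(raw);
    }

    // Reverses the encoding, returning the "user:pass" string
    // carried by a header value such as the one Fiddler captured.
    public static string DecodeHeaderValue(string headerValue)
    {
        string b64 = headerValue.Substring("Basic ".Length);
        return Encoding.UTF8.GetString(Convert.FromBase64String(b64));
    }
}
```

Comparing the output of `BuildHeaderValue` for the intended credentials with the captured header is a quick way to check that a request would carry the same credentials as the browser session. Note that Base64 is an encoding, not encryption, so a captured header reveals the password.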

Any suggestions?

1 answer:

Answer 0: (score: 2)

Instead of:

req.Credentials = CredentialCache.DefaultNetworkCredentials;

you can specify credentials with an explicit username and password:

req.Credentials = new NetworkCredential("username", "password");

This should get you past the login prompt (assuming you supply the correct username and password).
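
Putting that change into the function from the question might look like the sketch below. It assumes the device answers with a standard `401` challenge for Basic authentication; the `TransmitterClient` class and the `CreateRequest` helper are hypothetical names introduced here, and `using` blocks replace the manual `Close()` calls. The `SecureString` and `CredentialCache` from the question are dropped because the request never used them.

```csharp
using System;
using System.IO;
using System.Net;

public static class TransmitterClient
{
    // Builds the request without sending it, attaching the explicit
    // credentials instead of CredentialCache.DefaultNetworkCredentials
    // (which represents the current Windows logon, not the device login).
    public static HttpWebRequest CreateRequest(string url, string user, string pass)
    {
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(new Uri(url));
        req.Credentials = new NetworkCredential(user, pass);
        return req;
    }

    // Sends the request and returns the response body as a string.
    public static string GetSourceCode(string url, string user, string pass)
    {
        HttpWebRequest req = CreateRequest(url, user, pass);
        using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
        using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
        {
            return sr.ReadToEnd();
        }
    }
}
```

If the server still rejects the request, one option to try is setting the header directly with `req.Headers["Authorization"]`, using a value of the form `Basic <base64 of user:password>` as seen in the Fiddler trace. On newer .NET versions `WebRequest.Create` is marked obsolete in favor of `HttpClient`, so this sketch fits the .NET Framework setting of the original question.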