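I'd like to know which programming language would let me "automatically read" websites. For example, I want to be able to write code that says: log in to Stack Overflow with this password, and if anything on this page changes, send me an email...
Thanks for reading!
PS: I know some HTML and C++.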
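Answer 0 (score: 1)
OK, the language doesn't really matter. If you build the application with Visual Basic on Windows, you can automate a browser object so that it does exactly what you would do when navigating by hand. I usually use Java for this; there are libraries for it (I personally like com.gargoylesoftware.htmlunit.WebClient).
Example: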
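// Imports needed by this snippet (HtmlUnit 2.x). The statements below
// belong inside a method that declares "throws Exception".
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;

final WebClient webClient = new WebClient();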
webClient.setThrowExceptionOnFailingStatusCode(false);
webClient.setThrowExceptionOnScriptError(false);
//webClient.setAppletEnabled(false);
//webClient.setJavaScriptEnabled(false);
// Get the first page
final HtmlPage page1 = webClient.getPage("http://fist.page/address.html");
// Get the form that we are dealing with and within that form,
// find the submit button and the field that we want to change.
final HtmlForm form = page1.getFormByName("form1");
final HtmlSubmitInput button = form.getInputByName("send_button");
final HtmlTextInput input1 = form.getInputByName("input1");
final HtmlTextInput input2 = form.getInputByName("input2");
// Change the value of the text field
input1.setValueAttribute("I would type this");
input2.setValueAttribute("I would type that");
// Now submit the form by clicking the button and get back the second page.
final HtmlPage page2 = button.click();
In C++, something like this seems to be what you need:
void ProgressTest(void)
{
    // Set URL and callback function.
    // ProgressProc is the progress callback defined in the article linked below.
    WinHttpClient client(L"http://www.codeproject.com/", ProgressProc);
    client.SendHttpRequest();
    wstring httpResponseHeader = client.GetResponseHeader();
    wstring httpResponseContent = client.GetResponseContent();
}
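From: http://www.codeproject.com/Articles/66625/A-Fully-Featured-Windows-HTTP-Wrapper-in-C
Answer 1 (score: 1)
With some C++ background you will pick up Python quickly, so I suggest you try MechanicalSoup, a Python library that lets you automate actions on the web. It is modeled on the Mechanize library, which had gone unmaintained.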
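To make that a bit more concrete, here is a rough sketch of how the task from the question (log in, watch a page, send an email when it changes) might look in Python. It is only an outline under several assumptions: the field names "email" and "password", the choice of the first form on the login page, the URLs passed in, and the unauthenticated SMTP server are all placeholders rather than anything taken from a real site. The helper names (fetch_page_after_login, fingerprint, build_alert, check_once) are made up for this example.

```python
import hashlib
import smtplib
from email.message import EmailMessage


def fetch_page_after_login(login_url, watch_url, username, password):
    """Log in through an HTML form, then return the watched page's HTML."""
    # Third-party dependency (pip install MechanicalSoup). Imported here so
    # the helpers below stay usable even when it is not installed.
    import mechanicalsoup

    browser = mechanicalsoup.StatefulBrowser()
    browser.open(login_url)
    # Assumption: the first <form> on the page is the login form and its
    # fields are named "email" and "password". Adjust for the real site.
    browser.select_form("form")
    browser["email"] = username
    browser["password"] = password
    browser.submit_selected()
    # The browser keeps the session cookies, so this request is logged in.
    response = browser.open(watch_url)
    return response.text


def fingerprint(html):
    """Return a SHA-256 hex digest that differs whenever the HTML differs."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def build_alert(sender, recipient, url):
    """Build (but do not send) the notification email."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Page changed: " + url
    message.set_content("The page at " + url + " has changed since the last check.")
    return message


def check_once(previous_fingerprint, html, sender, recipient, url, smtp_host=None):
    """Compare fingerprints and return (new_fingerprint, alert_or_None)."""
    current = fingerprint(html)
    # First run (nothing to compare against) or no change: no alert.
    if previous_fingerprint is None or current == previous_fingerprint:
        return current, None
    alert = build_alert(sender, recipient, url)
    if smtp_host is not None:
        # Assumption: an SMTP server that accepts mail without authentication.
        with smtplib.SMTP(smtp_host) as server:
            server.send_message(alert)
    return current, alert
```

You would call fetch_page_after_login on a schedule (a cron job, for instance), keep the fingerprint from the previous run in a file, and pass both to check_once. Keep in mind that hashing the whole page is crude: anything that varies on every request, such as timestamps or ads, will count as a change, so in practice you would probably hash only the part of the page you care about.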