I would like to know whether it is possible to search YouTube using HtmlUnit. I started writing the code; here it is:
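import java.io.IOException;
import java.net.MalformedURLException;
import java.util.List;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlAnchor;
import com.gargoylesoftware.htmlunit.html.HtmlButton;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;

public class HtmlUnitExampleTestBase {
    private static final String YOUTUBE = "http://www.youtube.com";

    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException {
        WebClient webClient = new WebClient();
        webClient.setThrowExceptionOnScriptError(false);
        // This is equivalent to typing youtube.com into the browser's address bar
        HtmlPage currentPage = webClient.getPage(YOUTUBE);
        // Get the form where the submit button is located
        HtmlForm searchForm = (HtmlForm) currentPage.getElementById("masthead-search");
        // Print the form
        System.out.println(searchForm.asText());
        // Submit the search using the added-button workaround from Answer 0;
        // newPage is the results page that the links are read from
        HtmlTextInput searchInput = (HtmlTextInput) currentPage.getElementById("masthead-search-term");
        searchInput.setText("Nyan Cat");
        HtmlButton submitButton = (HtmlButton) currentPage.createElement("button");
        submitButton.setAttribute("type", "submit");
        searchForm.appendChild(submitButton);
        HtmlPage newPage = submitButton.click();
        // Collect the result links and print them as absolute URLs
        @SuppressWarnings("unchecked")
        final List<HtmlAnchor> listLinks = (List<HtmlAnchor>) newPage.getByXPath("//a[@class='ux-thumb-wrap result-item-thumb']");
        for (HtmlAnchor link : listLinks) {
            System.out.println(YOUTUBE + link.getAttribute("href"));
        }
    }
}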
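Now I don't know how to type some text into the search field and then press the "Search" button.
I have looked at tutorials on HtmlUnit, but I ran into a problem: they use a method called getElementByName, and the search button on YouTube has no name, only an id. Can anyone help me?
Edit: I have edited the code above, and now I get the YouTube links from the first page. Before that, though, I need to sort the results by upload date and then grab the links. Can someone help me with the sorting?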
Answer 0 (score: 3)
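I'm not an HtmlUnit expert, but there is a workaround: you can add your own button to the form and use it to submit the form.
Here is a code example with comments: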
import java.io.IOException;
import java.net.MalformedURLException;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlButton;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;

public class HtmlUnitExampleTestBase {
    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException {
        WebClient webClient = new WebClient();
        webClient.setThrowExceptionOnScriptError(false);
        // This is equivalent to typing youtube.com into the browser's address bar
        HtmlPage currentPage = webClient.getPage("http://www.youtube.com");
        // Get the form where the submit button is located
        HtmlForm searchForm = (HtmlForm) currentPage.getElementById("masthead-search");
        // Get the input field.
        HtmlTextInput searchInput = (HtmlTextInput) currentPage.getElementById("masthead-search-term");
        // Insert the search term.
        searchInput.setText("Nyan Cat");
        // Workaround: create a 'fake' button and add it to the form.
        HtmlButton submitButton = (HtmlButton) currentPage.createElement("button");
        submitButton.setAttribute("type", "submit");
        searchForm.appendChild(submitButton);
        // Workaround: use the reference to the button to submit the form.
        HtmlPage newPage = submitButton.click();
        System.out.println(newPage.asText());
    }
}
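If the only goal is to reach the results page, it may also be possible to skip the form and request the results URL directly. The sketch below is not part of the workaround above: it assumes that YouTube serves search results at `/results` with the term in a `search_query` parameter, and the class and method names (`YouTubeSearchUrl`, `forTerm`) are made up for illustration. It uses only the standard library and needs Java 10 or later for the `URLEncoder.encode(String, Charset)` overload.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HtmlUnit or of the code above.
final class YouTubeSearchUrl {
    private static final String YOUTUBE = "http://www.youtube.com";

    // Assumption: results live at /results and the search term is passed
    // in the "search_query" parameter.
    static String forTerm(String term) {
        // URLEncoder applies form encoding, so a space becomes '+'
        // and reserved characters such as '&' are percent-encoded.
        return YOUTUBE + "/results?search_query="
                + URLEncoder.encode(term, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Prints http://www.youtube.com/results?search_query=Nyan+Cat
        System.out.println(forTerm("Nyan Cat"));
    }
}
```

The returned string could then be passed to `webClient.getPage(...)` in place of the form submission. Whether that page has the same markup as the one produced by submitting the form would need to be checked.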
Answer 1 (score: 1)
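HtmlUnit is fine, but I would much rather use Watir or Selenium for web automation.
One drawback of HtmlUnit is that it lacks selector methods for fetching DOM elements in a jQuery-like way. Take a look at the css-selector project, an add-on to HtmlUnit that should help you do what you want easily. There is an introduction at Gooder Code.
Once you have that working, the selector for the YouTube search form would be ".search-term" and the selector for the submit button would be ".search-button".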