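I want to log in to v.qq.com using Java. I can log in with Selenium, but I have to abandon that approach: Selenium needs a Chrome driver, which in turn needs GTK and many other things on the remote server. I then tried HtmlUnit, which works for other video sites but not for v.qq.com :(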
selenium version:3.0.1
htmlunit version:2.24
Here is my Selenium and HtmlUnit code. The Selenium function works on my computer, but the HtmlUnit one does not. I hope to get some help...
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlInput;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.google.gson.Gson;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.TimeUnit;
public class Test {
    public static final Gson gson = new Gson();

    public static void main(String[] args) throws InterruptedException, IOException {
        String user = "3245845924";
        String pass = "yoyo123456";
        // Set<org.openqa.selenium.Cookie> cookies = selenium(user, pass);// success: 30+ cookies
        Set<com.gargoylesoftware.htmlunit.util.Cookie> cookies = htmlunit(user, pass);// fail: only 13 cookies
        System.out.println(cookies);
        System.out.println(gson.toJson(cookies));
        System.out.println(cookies.size());
    }
    public static Set<com.gargoylesoftware.htmlunit.util.Cookie> htmlunit(String user, String pass) throws IOException {
        String url = "http://xui.ptlogin2.qq.com/cgi-bin/xlogin?appid=532001601&s_url=https%3A//v.qq.com";// also tried https
        WebClient client = new WebClient(BrowserVersion.BEST_SUPPORTED);// BEST_SUPPORTED=CHROME, also tried FIREFOX_45, EDGE, INTERNET_EXPLORER and FIREFOX_38
        client.setJavaScriptTimeout(1000 * 30);
        client.getOptions().setTimeout(1000 * 30);
        // client.getOptions().setCssEnabled(false);// also tried this
        client.getOptions().setThrowExceptionOnFailingStatusCode(false);
        client.getOptions().setThrowExceptionOnScriptError(false);
        // client.setAjaxController(new NicelyResynchronizingAjaxController());// also tried this
        HtmlPage page = client.getPage(url);
        client.waitForBackgroundJavaScript(1000 * 20);
        // page = page.getElementById("switcher_plogin").click();
        HtmlInput userElement = (HtmlInput) page.getElementById("u");
        userElement.type(user);// also tried setValueAttribute function
        HtmlInput passElement = (HtmlInput) page.getElementById("p");
        passElement.type(pass);
        HtmlElement loginElement = (HtmlElement) page.getElementById("login_button");
        // Cookies before submitting the login form
        System.out.println(client.getCookieManager().getCookies());
        System.out.println(client.getCookieManager().getCookies().size());// size: 11
        HtmlPage page2 = loginElement.click();
        client.waitForBackgroundJavaScript(1000 * 30);
        // Cookies after submitting the login form
        System.out.println(client.getCookieManager().getCookies());
        System.out.println(client.getCookieManager().getCookies().size());// size: 13
        // Read the cookies before closing the client, then release its resources
        Set<com.gargoylesoftware.htmlunit.util.Cookie> cookies = client.getCookieManager().getCookies();
        client.close();
        return cookies;
    }
    public static Set<org.openqa.selenium.Cookie> selenium(String user, String pass) throws InterruptedException {
        String url = "https://xui.ptlogin2.qq.com/cgi-bin/xlogin?appid=532001601&&s_url=https%3A//v.qq.com";
        System.setProperty("webdriver.chrome.driver", "/tmp/chromedriver");
        ChromeDriver driver = new ChromeDriver();
        driver.get(url);
        TimeUnit.SECONDS.sleep(5);
        WebElement element = driver.findElementById("switcher_plogin");
        element.click();
        element = driver.findElementById("u");
        element.click();
        driver.getKeyboard().pressKey(user);
        element = driver.findElementById("p");
        element.click();
        driver.getKeyboard().pressKey(pass);
        element = driver.findElementById("login_button");
        element.click();
        TimeUnit.SECONDS.sleep(10);
        Set<org.openqa.selenium.Cookie> ret = driver.manage().getCookies();
        // System.out.println(driver.getTitle());// success, go to home page, title: 腾讯视频-中国领先的在线视频媒体平台,海量高清视频在线观看
        driver.close();
        driver.quit();
        return ret;
    }
}
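One way to narrow this down might be to compare the cookie names from the two runs and see exactly which ones HtmlUnit never receives. The following is only a minimal sketch: `CookieDiff` and `missingNames` are hypothetical helpers, and the cookie names in `main` are made-up placeholders, not values observed on v.qq.com.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class CookieDiff {
    // Returns the names present in 'expected' but absent from 'actual', sorted alphabetically.
    static Set<String> missingNames(Set<String> expected, Set<String> actual) {
        Set<String> missing = new TreeSet<>(expected);
        missing.removeAll(actual);
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder names only; real input would be the names collected from each run.
        Set<String> fromSelenium = new TreeSet<>(Arrays.asList("cookie_a", "cookie_b", "cookie_c"));
        Set<String> fromHtmlUnit = new TreeSet<>(Arrays.asList("cookie_a"));
        System.out.println(missingNames(fromSelenium, fromHtmlUnit)); // prints [cookie_b, cookie_c]
    }
}
```

In practice the two input sets could be built by calling `getName()` on each cookie returned by the Selenium run and by the HtmlUnit run, since both cookie classes expose that method.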
HtmlUnit prints some logs when it runs. I think they may help (although I can't understand them):
2017-01-29 19:15:45,402 [org.apache.http.client.protocol.ResponseProcessCookies]-[WARN] Cookie rejected [pt_user_id="10515554044972903633", version:0, domain:ui.ptlogin2.qq.com, path:/, expiry:Wed Jan 27 19:15:45 CST 2027] Illegal 'domain' attribute "ui.ptlogin2.qq.com". Domain of origin: "xui.ptlogin2.qq.com"
2017-01-29 19:15:45,406 [org.apache.http.client.protocol.ResponseProcessCookies]-[WARN] Cookie rejected [ptui_identifier="000DF936CD35D39B19CDD2C005A3FF6C0CF1D3AF4EAD04B9B4EC43F3", version:0, domain:ui.ptlogin2.qq.com, path:/, expiry:null] Illegal 'domain' attribute "ui.ptlogin2.qq.com". Domain of origin: "xui.ptlogin2.qq.com"
2017-01-29 19:15:46,312 [com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl]-[WARN] Obsolete content type encountered: 'application/x-javascript'.
2017-01-29 19:15:46,677 [com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl]-[WARN] Obsolete content type encountered: 'application/x-javascript'.
2017-01-29 19:15:47,371 [com.gargoylesoftware.htmlunit.DefaultCssErrorHandler]-[WARN] CSS error: 'http://xui.ptlogin2.qq.com/cgi-bin/xlogin?appid=532001601&s_url=http%3A//v.qq.com' [1:4914] Error in style rule. (Invalid token "+". Was expecting one of: <EOF>, <S>, <IDENT>, "}", ";", "*".)
2017-01-29 19:15:47,371 [com.gargoylesoftware.htmlunit.DefaultCssErrorHandler]-[WARN] CSS warning: 'http://xui.ptlogin2.qq.com/cgi-bin/xlogin?appid=532001601&s_url=http%3A//v.qq.com' [1:4914] Ignoring the following declarations in this rule.
2017-01-29 19:15:47,735 [com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl]-[WARN] Obsolete content type encountered: 'application/x-javascript'.
2017-01-29 19:15:48,166 [com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl]-[WARN] Obsolete content type encountered: 'application/x-javascript'.
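--- Update 2017.01.30 ---
In the end I managed to log in to v.qq.com through a different login link, but with PhantomJS instead of HtmlUnit. You can find the link in the comments (posting more links requires 10+ reputation).
But I still can't understand these logs :(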
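For what it's worth, the two "Cookie rejected" warnings look like the result of a cookie domain check: the server at xui.ptlogin2.qq.com sets cookies with Domain=ui.ptlogin2.qq.com, and a host may normally only set cookies for its own name or a parent domain of it. The sketch below is a simplified version of the RFC 6265 domain-match rule, not the actual Apache HttpClient code; `CookieDomainCheck` and `domainMatches` are hypothetical names used only for illustration.

```java
import java.util.Locale;

public class CookieDomainCheck {
    // Simplified RFC 6265-style domain match (assumption: hostnames only, no IP addresses).
    // The origin host matches if it equals the cookie domain or is a subdomain of it.
    static boolean domainMatches(String cookieDomain, String originHost) {
        String domain = cookieDomain.toLowerCase(Locale.ROOT);
        if (domain.startsWith(".")) {
            domain = domain.substring(1); // a leading dot is ignored
        }
        String host = originHost.toLowerCase(Locale.ROOT);
        return host.equals(domain) || host.endsWith("." + domain);
    }

    public static void main(String[] args) {
        String origin = "xui.ptlogin2.qq.com";
        // "xui" is not a subdomain of "ui": the strings share a suffix, but not on a dot boundary.
        System.out.println(domainMatches("ui.ptlogin2.qq.com", origin)); // false
        // A true parent domain of the origin host matches.
        System.out.println(domainMatches("ptlogin2.qq.com", origin));    // true
    }
}
```

Under this rule, ui.ptlogin2.qq.com is a sibling of xui.ptlogin2.qq.com rather than a parent, which would explain why those two cookies are dropped. Whether those particular cookies matter for the login is not something the logs alone show.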