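I need to log in to a website programmatically using a Java client. The site has no API and uses form-based authentication. Once my client has logged in, it needs to download a file from the same site under a different URL.
I tried basic authentication against the second URL, but that did not work. I also tried implementing a client similar to this example: http://www.mkyong.com/java/how-to-automate-login-a-website-java-example/. It does not seem to carry my client's session over to the file download URL.
Here is the source code for what I am trying to achieve: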
    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.DataOutputStream;
    import java.io.File;
    import java.io.FileWriter;
    import java.io.InputStreamReader;
    import java.net.CookieHandler;
    import java.net.CookieManager;
    import java.net.URL;
    import java.net.URLEncoder;
    import java.util.ArrayList;
    import java.util.List;
    import javax.net.ssl.HttpsURLConnection;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.parser.Parser;
    import org.jsoup.select.Elements;

    public class Client {
        public static void main(String[] args) throws Exception {
            String authURL = "https://hostname/webapp/loginUrl?parameters=values";
            String fileURL = "https://hostname/webapp/downloadFile?parameters=values";
            String username = "username";
            String password = "password";

            // downloading the login page
            List<String> cookies = null;
            CookieHandler.setDefault(new CookieManager());
            URL url = new URL(authURL);
            HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            conn.setUseCaches(false);
            conn.setRequestProperty("User-Agent", "Mozilla/5.0");
            conn.setRequestProperty("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
            conn.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
            BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
            String inputLine;
            StringBuffer response = new StringBuffer();
            while ((inputLine = in.readLine()) != null) {
                response.append(inputLine);
            }
            in.close();
            cookies = conn.getHeaderFields().get("Set-Cookie");
            String loginPage = response.toString();

            // extracting form parameters
            Document doc = Jsoup.parse(loginPage, "", Parser.xmlParser());
            Element loginform = doc.getElementById("Logon");
            Elements inputElements = loginform.getElementsByTag("input");
            List<String> paramList = new ArrayList<String>();
            for (Element inputElement : inputElements) {
                String key = inputElement.attr("name");
                String value = inputElement.attr("value");
                if (key.equals("logonId")) {
                    value = username;
                }
                if (key.equals("logonPassword")) {
                    value = password;
                }
                paramList.add(key + "=" + URLEncoder.encode(value, "UTF-8"));
            }
            StringBuilder params = new StringBuilder();
            for (String param : paramList) {
                if (params.length() == 0) {
                    params.append(param);
                } else {
                    params.append("&" + param);
                }
            }

            // send post request to login
            conn = (HttpsURLConnection) url.openConnection();
            conn.setUseCaches(false);
            conn.setRequestMethod("POST");
            conn.setRequestProperty("User-Agent", "Mozilla/5.0");
            conn.setRequestProperty("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
            conn.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
            for (String cookie : cookies) {
                conn.addRequestProperty("Cookie", cookie.split(";", 1)[0]);
            }
            conn.setRequestProperty("Connection", "keep-alive");
            conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            conn.setRequestProperty("Content-Length", Integer.toString(params.length()));
            conn.setDoOutput(true);
            conn.setDoInput(true);
            DataOutputStream wr = new DataOutputStream(conn.getOutputStream());
            wr.writeBytes(params.toString());
            wr.flush();
            wr.close();

            // download the file
            url = new URL(fileURL);
            conn = (HttpsURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            conn.setUseCaches(false);
            conn.setRequestProperty("User-Agent", "Mozilla/5.0");
            conn.setRequestProperty("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
            conn.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
            if (cookies != null) {
                for (String cookie : cookies) {
                    conn.addRequestProperty("Cookie", cookie.split(";", 1)[0]);
                }
            }
            in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
            response = new StringBuffer();
            while ((inputLine = in.readLine()) != null) {
                response.append(inputLine);
            }
            in.close();
            cookies = conn.getHeaderFields().get("Set-Cookie");

            // save the file to local disk
            BufferedWriter bwr = new BufferedWriter(new FileWriter(new File("c:/temp/content")));
            bwr.write(response.toString());
            bwr.flush();
            bwr.close();
        }
    }
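What is the right approach for this kind of scenario? Should my client capture and maintain the cookies / session / session token created by the page?
Thanks.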
Answer 0 (score: 0)
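Whether you need a session or something similar depends on the website, so you should first check what the web page really does.
My usual approach is to use a web proxy such as Fiddler or Burp to inspect all network calls while browsing the page manually. I can then replay the interesting calls from the proxy, removing headers step by step to see what is really required. Once the request has been stripped down to the minimum and all required headers and data have been identified, you can re-implement the calls in Java.
Apache commons http together with a cookie manager would be my preferred library for implementing this on the Java side (but that depends on what the website requires).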
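To illustrate the cookie manager idea, here is a minimal sketch. It swaps in the JDK's own java.net.http.HttpClient (Java 11+) instead of the Apache library, so that it runs without extra dependencies. The class name SessionLoginSketch, the /login and /file paths, the JSESSIONID cookie and the throwaway local server in demo() are all assumptions made for illustration; only the logonId and logonPassword field names come from the question. A real site may also need hidden form fields, redirects or extra headers, which the proxy inspection described above would reveal.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class SessionLoginSketch {

    // Posts the login form, then downloads fileUrl with the same cookie-aware client.
    static byte[] loginAndDownload(String loginUrl, String fileUrl,
                                   String user, String password) throws Exception {
        // One client for both requests: its CookieManager stores and replays cookies.
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager(null, CookiePolicy.ACCEPT_ALL))
                .build();

        String form = "logonId=" + URLEncoder.encode(user, StandardCharsets.UTF_8)
                + "&logonPassword=" + URLEncoder.encode(password, StandardCharsets.UTF_8);
        HttpRequest login = HttpRequest.newBuilder(URI.create(loginUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
        // Any Set-Cookie header in the login response ends up in the CookieManager.
        client.send(login, HttpResponse.BodyHandlers.discarding());

        HttpRequest download = HttpRequest.newBuilder(URI.create(fileUrl)).GET().build();
        HttpResponse<byte[]> file = client.send(download, HttpResponse.BodyHandlers.ofByteArray());
        if (file.statusCode() != 200) {
            throw new IOException("Download failed with HTTP " + file.statusCode());
        }
        return file.body(); // raw bytes, so binary files are not corrupted
    }

    // Runs the flow against a throwaway local server standing in for the real site (hypothetical).
    static String demo() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/login", ex -> {
            ex.getRequestBody().readAllBytes();
            ex.getResponseHeaders().add("Set-Cookie", "JSESSIONID=abc123; Path=/");
            ex.sendResponseHeaders(200, -1); // no response body
            ex.close();
        });
        server.createContext("/file", ex -> {
            // Serve the file only when the session cookie from /login is sent back.
            String cookie = ex.getRequestHeaders().getFirst("Cookie");
            boolean ok = cookie != null && cookie.contains("JSESSIONID=abc123");
            byte[] body = (ok ? "file-bytes" : "denied").getBytes(StandardCharsets.UTF_8);
            ex.sendResponseHeaders(ok ? 200 : 403, body.length);
            ex.getResponseBody().write(body);
            ex.close();
        });
        server.start();
        try {
            String base = "http://127.0.0.1:" + server.getAddress().getPort();
            byte[] data = loginAndDownload(base + "/login", base + "/file", "username", "password");
            return new String(data, StandardCharsets.UTF_8);
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

Because the same client instance is reused, no Cookie header has to be copied by hand between the login and the download request. Against a real site you would call loginAndDownload with its actual login and download URLs and write the returned bytes to disk, for example with java.nio.file.Files.write.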
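If the web page is very complex and obfuscated, using a real browser is sometimes the simplest way. You can use tools such as ui4j or Selenium (with jBrowserDriver or Firefox as the engine) to remote-control a web browser and simulate a user browsing the site.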