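Here is the idea:
Something like a desktop application that lets you search Google without going through a browser?
I just need to be pointed in the right direction (perhaps there is some method I should look for). I am not very familiar with the Java API.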
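Answer 0 (score: 1)
You can use Java's standard HttpURLConnection to run the search. To parse the response, all you need is Apache Tika, which extracts the text from an HTML page.
Here is a simple example using HttpURLConnection: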
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SimpleHTTPRequest {

    public static void main(String[] args) {
        HttpURLConnection connection = null;
        try {
            // A GET request carries its parameters in the URL, so encode the query there
            String query = URLEncoder.encode("test", "UTF-8");
            URL serverAddress = new URL("http://www.google.com/search?q=" + query);

            // Set up the connection. No request body is written: writing to the
            // output stream would turn the GET into a POST.
            connection = (HttpURLConnection) serverAddress.openConnection();
            connection.setRequestMethod("GET");
            connection.setUseCaches(false);
            connection.setConnectTimeout(10000);
            connection.setReadTimeout(10000);

            // Read the result from the server (assumes a UTF-8 response);
            // try-with-resources closes the reader when done
            StringBuilder sb = new StringBuilder();
            try (BufferedReader rd = new BufferedReader(
                    new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = rd.readLine()) != null) {
                    sb.append(line).append('\n');
                }
            }
            System.out.println(sb);
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Release the connection; it is still null if openConnection() failed
            if (connection != null) {
                connection.disconnect();
            }
        }
    }
}
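The example above hardcodes the query "test". For arbitrary search terms, the query has to be URL-encoded before it goes into the address. A minimal sketch, assuming a hypothetical helper class SearchUrlBuilder (not part of the original answer) and the same http://www.google.com/search endpoint used above:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not part of the original answer
final class SearchUrlBuilder {
    // Endpoint taken from the example above
    private static final String BASE = "http://www.google.com/search?q=";

    // Returns the search URL for an arbitrary, unencoded query string
    static String build(String query) {
        try {
            // URLEncoder turns spaces into '+' and percent-escapes reserved characters
            return BASE + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("java http client"));
    }
}
```

The resulting string can be passed straight to new URL(...) in the connection code.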
And here you can find an example of extracting text with Apache Tika.
Answer 1 (score: 0)
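You need to understand Java socket programming and how web servers work. Beyond that, you can use the HttpURLConnection class to open a connection to the web server and download content.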
http://docs.oracle.com/javase/1.4.2/docs/api/java/net/HttpURLConnection.html
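To make the socket-level view concrete: an HTTP GET is a few lines of text written to a TCP connection, usually on port 80. A rough sketch, assuming a hypothetical class RawHttpGet and HTTP/1.0 so that the server closes the connection after the response. It ignores redirects, HTTPS, and character-set negotiation, which HttpURLConnection takes care of:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of the original answer
final class RawHttpGet {

    // Builds the request text that is sent over the socket
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.0\r\n"
                + "Host: " + host + "\r\n"
                + "Connection: close\r\n"
                + "\r\n"; // the blank line ends the header section
    }

    // Opens a TCP connection on port 80, sends the request, and returns the
    // raw response (status line, headers, and body) as text
    static String fetch(String host, String path) throws IOException {
        try (Socket socket = new Socket(host, 80)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();

            StringBuilder response = new StringBuilder();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            String line;
            // The server closes the connection when done, so readLine() returns null
            while ((line = in.readLine()) != null) {
                response.append(line).append('\n');
            }
            return response.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetch("www.google.com", "/search?q=test"));
    }
}
```

In practice HttpURLConnection is the more convenient choice; the sketch only shows what it does underneath.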
Answer 2 (score: 0)
You can use the open-source library Apache HttpComponents, which simplifies the work.
Answer 3 (score: 0)
You have to use the URL class to connect to the network.
For example:
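import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import org.json.JSONObject; // assumed: the JSONObject class from the org.json library

// Assumes `url` is a String holding the address of a JSON resource
URL url1 = new URL(url);
StringBuilder data = new StringBuilder();
// Read until readLine() returns null; available() == 0 does not mean end of stream
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(url1.openStream(), StandardCharsets.UTF_8))) {
    String line;
    while ((line = reader.readLine()) != null) {
        data.append(line);
    }
}
JSONObject jobj = new JSONObject(data.toString());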
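The reading step can be isolated into a small helper that works on any InputStream and can be tried without network access. A minimal sketch, assuming a hypothetical class StreamText and UTF-8 content. It reads until read() returns -1, since available() only reports how many bytes can be read without blocking and is not an end-of-stream check:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the original answer
final class StreamText {

    // Reads the whole stream as UTF-8 text and closes it afterwards
    static String readAll(InputStream input) {
        StringBuilder data = new StringBuilder();
        try (Reader reader = new InputStreamReader(input, StandardCharsets.UTF_8)) {
            char[] buffer = new char[1024];
            int n;
            // read() returns -1 only at the real end of the stream
            while ((n = reader.read(buffer)) != -1) {
                data.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return data.toString();
    }

    public static void main(String[] args) {
        // An in-memory stream stands in for url1.openStream()
        InputStream in = new ByteArrayInputStream(
                "{\"q\":\"test\"}".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(in));
    }
}
```

The returned string can then be handed to the JSON parser in the same way as the data variable above.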