从网页获取文本到字符串

时间:2013-01-19 19:28:51

标签: android string text webpage

我是Android新手,我希望将网页中的整个文本转换为字符串。我发现了很多像这样的问题,但正如我所说,我是Android新手,我不知道如何在我的应用程序中使用它们。我收到了错误。只有一种方法我设法使它工作,它使用WebView和JavaScript,它很慢,因为地狱。有人可以告诉我一些其他的方法来做到这一点或如何加快WebView,因为我根本不使用它来查看内容。 顺便说一下,我添加了以下代码来加速WebView

webView.getSettings().setJavaScriptEnabled(true); 
    webView.getSettings().setBlockNetworkImage(true);
    webView.getSettings().setJavaScriptCanOpenWindowsAutomatically(false);
    webView.getSettings().setPluginsEnabled(false);
    webView.getSettings().setSupportMultipleWindows(false);
    webView.getSettings().setSupportZoom(false);
    webView.getSettings().setSavePassword(false);
    webView.setVerticalScrollBarEnabled(false);
    webView.setHorizontalScrollBarEnabled(false);
    webView.getSettings().setAppCacheEnabled(false);
    webView.getSettings().setCacheMode(WebSettings.LOAD_NO_CACHE);

如果您知道其他比使用WebView更好更快的解决方案,请告诉我主要活动的完整源代码或解释我应该写的地方,这样我就不会出错。

3 个答案:

答案 0 :(得分:27)

使用此:

public class ReadWebpageAsyncTask extends Activity {
    private TextView textView;

    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        textView = (TextView) findViewById(R.id.TextView01);
    }

    private class DownloadWebPageTask extends AsyncTask<String, Void, String> {
        @Override
        protected String doInBackground(String... urls) {
            String response = "";
            for (String url : urls) {
                DefaultHttpClient client = new DefaultHttpClient();
                HttpGet httpGet = new HttpGet(url);
                try {
                    HttpResponse execute = client.execute(httpGet);
                    InputStream content = execute.getEntity().getContent();

                    BufferedReader buffer = new BufferedReader(
                            new InputStreamReader(content));
                    String s = "";
                    while ((s = buffer.readLine()) != null) {
                        response += s;
                    }

                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
            return response;
        }

        @Override
        protected void onPostExecute(String result) {
            textView.setText(Html.fromHtml(result));
        }
    }

    public void readWebpage(View view) {
        DownloadWebPageTask task = new DownloadWebPageTask();
        task.execute(new String[] { "http://www.google.com" });

    }
}

main.xml中

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:orientation="vertical"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    >

    <Button android:layout_height="wrap_content" android:layout_width="match_parent" android:id="@+id/readWebpage" android:onClick="readWebpage" android:text="Load Webpage"></Button>
    <TextView android:id="@+id/TextView01" android:layout_width="match_parent" android:layout_height="match_parent" android:text="Example Text"></TextView>
</LinearLayout>

答案 1 :(得分:5)

这是我通常用来从互联网上下载字符串的代码

class RequestTask extends AsyncTask<String, String, String>{

@Override
// username, password, message, mobile
protected String doInBackground(String... url) {
    // constants
    int timeoutSocket = 5000;
    int timeoutConnection = 5000;

    HttpParams httpParameters = new BasicHttpParams();
    HttpConnectionParams.setConnectionTimeout(httpParameters, timeoutConnection);
    HttpConnectionParams.setSoTimeout(httpParameters, timeoutSocket);
    HttpClient client = new DefaultHttpClient(httpParameters);

    HttpGet httpget = new HttpGet(url[0]);

    try {
        HttpResponse getResponse = client.execute(httpget);
        final int statusCode = getResponse.getStatusLine().getStatusCode();

        if(statusCode != HttpStatus.SC_OK) {
            Log.w("MyApp", "Download Error: " + statusCode + "| for URL: " + url);
            return null;
        }

        String line = "";
        StringBuilder total = new StringBuilder();

        HttpEntity getResponseEntity = getResponse.getEntity();

        BufferedReader reader = new BufferedReader(new InputStreamReader(getResponseEntity.getContent()));  

        while((line = reader.readLine()) != null) {
            total.append(line);
        }

        line = total.toString();
        return line;
    } catch (Exception e) {
        Log.w("MyApp", "Download Exception : " + e.toString());
    }
    return null;
}

@Override
protected void onPostExecute(String result) {
    // do something with result
}
}

您可以使用

运行任务
  

new RequestTask().execute("http://www.your-get-url.com/");

答案 2 :(得分:2)

看到您对查看内容不感兴趣,请尝试使用以下内容:

为了从URL获取源代码,您可以使用此代码:

HttpClient httpclient = new DefaultHttpClient(); // Create HTTP Client
HttpGet httpget = new HttpGet("http://yoururl.com"); // Set the action you want to do
HttpResponse response = httpclient.execute(httpget); // Executeit
HttpEntity entity = response.getEntity(); 
InputStream is = entity.getContent(); // Create an InputStream with the response
BufferedReader reader = new BufferedReader(new InputStreamReader(is, "iso-8859-1"), 8);
StringBuilder sb = new StringBuilder();
String line = null;
while ((line = reader.readLine()) != null) // Read line by line
    sb.append(line + "\n");

String resString = sb.toString(); // Result is here

is.close(); // Close the stream

确保您在AsyncTaskThread中的主UI线程中运行此功能。