How do I get HTML source code in Android?

Time: 2012-09-18 15:42:45

Tags: android html eclipse parsing

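I need to fetch the source of a web page as a string in Android. I tried doing it with HttpClient, HttpGet, and HttpResponse, but this approach doesn't work. I have to wrap every step of the method in try/catch, and the application force closes anyway.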

        public static String getHtmlSource(String webPage){
            HttpClient client = new DefaultHttpClient();
            URI url = null;
            try {
                url = new URI("http://www.kinopoisk.ru/film/581493");
            } catch (URISyntaxException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
            HttpGet request = new HttpGet(url);
            HttpResponse response = null;
            try {
                response = client.execute(request);
            } catch (ClientProtocolException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }

            String html = "";
            InputStream in = null;
            try {
                in = response.getEntity().getContent();
            } catch (IllegalStateException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
            BufferedReader reader = new BufferedReader(new InputStreamReader(in));
            StringBuilder str = new StringBuilder();
            String line = null;
            try {
                while((line = reader.readLine()) != null)
                {
                    str.append(line);
                }
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
            try {
                in.close();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
            html = str.toString();
            return html;
        }

Logcat:

09-18 19:28:07.989: W/dalvikvm(5708): threadid=1: thread exiting with uncaught exception (group=0x41f64300)
09-18 19:28:08.029: E/AndroidRuntime(5708): FATAL EXCEPTION: main
09-18 19:28:08.029: E/AndroidRuntime(5708): java.lang.RuntimeException: Unable to start activity ComponentInfo{com.gavrilov.egor.movies/com.gavrilov.egor.movies.MoviesList}: android.os.NetworkOnMainThreadException
09-18 19:28:08.029: E/AndroidRuntime(5708):     at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2059)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2084)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at android.app.ActivityThread.access$600(ActivityThread.java:130)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1195)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at android.os.Handler.dispatchMessage(Handler.java:99)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at android.os.Looper.loop(Looper.java:137)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at android.app.ActivityThread.main(ActivityThread.java:4745)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at java.lang.reflect.Method.invokeNative(Native Method)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at java.lang.reflect.Method.invoke(Method.java:511)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:786)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:553)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at dalvik.system.NativeStart.main(Native Method)
09-18 19:28:08.029: E/AndroidRuntime(5708): Caused by: android.os.NetworkOnMainThreadException
09-18 19:28:08.029: E/AndroidRuntime(5708):     at android.os.StrictMode$AndroidBlockGuardPolicy.onNetwork(StrictMode.java:1117)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at java.net.InetAddress.lookupHostByName(InetAddress.java:385)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at java.net.InetAddress.getAllByNameImpl(InetAddress.java:236)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at java.net.InetAddress.getAllByName(InetAddress.java:214)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:137)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:164)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:119)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:360)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:555)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:487)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:465)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at com.gavrilov.egor.movies.ParseHtmlPage.getHtmlSource(ParseHtmlPage.java:32)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at com.gavrilov.egor.movies.MoviesList.onCreate(MoviesList.java:30)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at android.app.Activity.performCreate(Activity.java:5008)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1079)
09-18 19:28:08.029: E/AndroidRuntime(5708):     at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2023)

2 answers:

Answer 0 (score: 1)

Could it be that you don't have the INTERNET permission in your manifest?

Edit: Ah, you're doing that on the main thread, aren't you?
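
The logcat shows `android.os.NetworkOnMainThreadException` raised from `getHtmlSource`, which is called in `MoviesList.onCreate`, so the request runs on the main thread. Below is a minimal sketch of moving the download to a background thread. It swaps the question's `HttpClient` for `HttpURLConnection` with an `ExecutorService`. The class name `HtmlFetcher`, the method names, the UTF-8 decoding, and the 10-second timeouts are assumptions made for illustration and are not part of the original code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper class; the name is illustrative only.
final class HtmlFetcher {

    // Single daemon worker thread so network I/O never runs on the caller's thread.
    private static final ExecutorService NETWORK_EXECUTOR =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "html-fetcher");
                t.setDaemon(true);
                return t;
            });

    private HtmlFetcher() {}

    // Reads the whole stream as text (UTF-8 assumed), keeping line breaks.
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Blocking download; must NOT be called from the Android main thread.
    static String getHtmlSource(String webPage) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(webPage).toURL().openConnection();
        try {
            conn.setConnectTimeout(10_000); // assumed timeout values
            conn.setReadTimeout(10_000);
            return readAll(conn.getInputStream());
        } finally {
            conn.disconnect();
        }
    }

    // Schedules the download on the background thread and returns immediately.
    static Future<String> getHtmlSourceAsync(String webPage) {
        Callable<String> task = () -> getHtmlSource(webPage);
        return NETWORK_EXECUTOR.submit(task);
    }
}
```

In an Activity, one would typically call `getHtmlSourceAsync(url)` and pass the result back to the UI thread (for example with `runOnUiThread`) instead of blocking on `Future.get()` inside `onCreate`, since blocking there would freeze the UI. The request also needs the `android.permission.INTERNET` permission declared in the manifest.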

Answer 1 (score: 0)

See this question to resolve the exception: How to fix android.os.NetworkOnMainThreadException? For HTML parsing, you can use jsoup: http://jsoup.org/