android - Using Jsoup

Date: 2016-03-17 13:40:03

Tags: android text jsoup webpage

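I am trying to scrape some text from a web page with Jsoup. The text is inside a div with the class "text". This is the part of my code that tries to fetch the content: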

try {
    Document doc = Jsoup.connect("http://website.com").get();
    Elements div = doc.select("meta[class=text]");
    String textString = div.toString();
} catch (IOException e) {
    e.printStackTrace();
}

When I run the activity, it fails on the line where I try to connect. This is the output from logcat:

  

03-17 14:30:34.270 23413-23413/? I/art: Late-enabling -Xcheck:jni
03-17 14:30:35.170 23413-23413/com.example.goliath.pomos I/View: ssignParent(ViewParent parent) parent is: android.view.ViewRootImpl@fc40abe
03-17 14:30:35.370 23413-23552/com.example.goliath.pomos I/OpenGLRenderer: Initialized EGL, version 1.4
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: FATAL EXCEPTION: main
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: Process: com.example.goliath.pomos, PID: 23413
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: android.os.NetworkOnMainThreadException
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at android.os.StrictMode$AndroidBlockGuardPolicy.onNetwork(StrictMode.java:1167)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at java.net.InetAddress.lookupHostByName(InetAddress.java:418)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at java.net.InetAddress.getAllByNameImpl(InetAddress.java:252)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at java.net.InetAddress.getAllByName(InetAddress.java:215)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at com.android.okhttp.HostResolver$1.getAllByName(HostResolver.java:29)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at com.android.okhttp.internal.http.RouteSelector.resetNextInetSocketAddress(RouteSelector.java:232)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at com.android.okhttp.internal.http.RouteSelector.next(RouteSelector.java:124)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at com.android.okhttp.internal.http.HttpEngine.connect(HttpEngine.java:272)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at com.android.okhttp.internal.http.HttpEngine.sendRequest(HttpEngine.java:211)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at com.android.okhttp.internal.http.HttpURLConnectionImpl.execute(HttpURLConnectionImpl.java:373)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at com.android.okhttp.internal.http.HttpURLConnectionImpl.connect(HttpURLConnectionImpl.java:106)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:512)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:493)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:205)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at org.jsoup.helper.HttpConnection.get(HttpConnection.java:194)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at com.example.goliath.pomos.Koli.onNavigationItemSelected(Koli.java:120)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at android.support.design.widget.NavigationView$1.onMenuItemSelected(NavigationView.java:150)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at android.support.v7.internal.view.menu.MenuBuilder.dispatchMenuItemSelected(MenuBuilder.java:811)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at android.support.v7.internal.view.menu.MenuItemImpl.invoke(MenuItemImpl.java:153)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at android.support.v7.internal.view.menu.MenuBuilder.performItemAction(MenuBuilder.java:958)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at android.support.design.internal.NavigationMenuPresenter$1.onClick(NavigationMenuPresenter.java:300)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at android.view.View.performClick(View.java:4768)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at android.view.View$PerformClick.run(View.java:19692)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at android.os.Handler.handleCallback(Handler.java:739)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at android.os.Handler.dispatchMessage(Handler.java:95)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at android.os.Looper.loop(Looper.java:135)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at android.app.ActivityThread.main(ActivityThread.java:5538)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at java.lang.reflect.Method.invoke(Native Method)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at java.lang.reflect.Method.invoke(Method.java:372)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:958)
03-17 14:30:37.580 23413-23413/com.example.goliath.pomos E/AndroidRuntime: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:753)
03-17 14:30:37.610 23413-23413/com.example.goliath.pomos I/Process: Sending signal. PID: 23413 SIG: 9

This is my first time using Jsoup, so any help would be appreciated.

2 Answers:

Answer 0: (score: 1)

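This exception is thrown when an application attempts to perform a network operation on its main thread. You should run the code in an AsyncTask, or disable the check (a bad choice):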

AsyncTask:

public class MainActivity extends AppCompatActivity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        new ParsePageTask().execute("http://stackoverflow.com/");
    }

    class ParsePageTask extends AsyncTask<String, Void, String> {
        protected String doInBackground(String... urls) {
            try {
                Document doc = Jsoup.connect(urls[0]).get();
                Elements div = doc.select("title");
                return div.toString();
            } catch (Exception ignored) {
            }

            return "";
        }

        protected void onPostExecute(String result) {
            // process results
            ((TextView) findViewById(R.id.text)).setText(result);
        }
    }
}
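AsyncTask was deprecated in API level 30, so the same idea (do the blocking fetch on a worker thread, then hand the result back) can also be sketched with plain java.util.concurrent. The class name BackgroundLoader and its method names are hypothetical and not part of the original answer. The callback executor is an assumption too: on Android you would pass something that posts to the main thread, for example a wrapper around new Handler(Looper.getMainLooper()).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

/** Hypothetical helper: runs a blocking loader on a worker thread. */
final class BackgroundLoader {
    // Single worker thread, so the blocking call never runs on the caller's thread.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Where callbacks are delivered; on Android this would post to the main thread.
    private final Executor callbackExecutor;

    BackgroundLoader(Executor callbackExecutor) {
        this.callbackExecutor = callbackExecutor;
    }

    /** Runs loader in the background and reports either a result or an error. */
    void load(Callable<String> loader,
              Consumer<String> onResult,
              Consumer<Exception> onError) {
        worker.execute(() -> {
            try {
                String result = loader.call();
                callbackExecutor.execute(() -> onResult.accept(result));
            } catch (Exception e) {
                // Report the failure instead of silently swallowing it.
                callbackExecutor.execute(() -> onError.accept(e));
            }
        });
    }

    /** Stops the worker thread once no more loads are needed. */
    void shutdown() {
        worker.shutdown();
    }
}
```

In an activity, the loader passed to load(...) would wrap the Jsoup call, for example () -> Jsoup.connect(url).get().title(), and onResult would update the TextView. Unlike the empty catch block in the AsyncTask version, errors reach onError, so a failed connection can be shown to the user.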

Disabling the "network on main thread" policy:

StrictMode.ThreadPolicy policy = new StrictMode.ThreadPolicy.Builder().permitAll().build();
StrictMode.setThreadPolicy(policy);
...
Document doc = Jsoup.connect("http://website.com").get();

Also, you should check that your AndroidManifest.xml file declares the internet permission:

<uses-permission android:name="android.permission.INTERNET"/>

Answer 1: (score: 0)

Two problems:

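  1. On Android, you cannot use the network on the main thread. You need to create an AsyncTask in order to use the Jsoup.connect(URL) method.

  2. If you want to select the div whose name attribute is text, you need to use select("div[name=text]"). What you wrote selects elements whose tag is named meta and whose class attribute is exactly text.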
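To compare the selectors without touching the network, here is a minimal sketch that parses an inline HTML string. The markup, the class name SelectorDemo, and the helper textOf are made up for illustration; the real page will differ. Since the question describes a div with the class "text", the class selector div.text is included alongside the div[name=text] form from this answer.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

/** Hypothetical demo of how different CSS queries match the same markup. */
final class SelectorDemo {
    // Made-up markup standing in for the real page.
    static final String HTML =
        "<html><head><meta name=\"text\" content=\"ignored\"></head>"
        + "<body><div class=\"text\">Hello from the div</div>"
        + "<div name=\"text\">Named div</div></body></html>";

    /** Parses html and returns the combined text of all elements matching cssQuery. */
    static String textOf(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        Elements matches = doc.select(cssQuery);
        // text() returns only the text content; toString() would return the outer HTML.
        return matches.text();
    }
}
```

With this markup, textOf(SelectorDemo.HTML, "div.text") matches the div by class and textOf(SelectorDemo.HTML, "div[name=text]") matches by name attribute, while "meta[class=text]" matches nothing because no meta element carries that class.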