Google CustomSearchEngine: How do I get the entire ArticleBody?

Time: 2017-08-03 13:32:22

Tags: json google-api-java-client google-custom-search

I use a Google Custom Search Engine (CSE) to crawl websites that publish news as articles, blog posts, and so on.

Link to my CSE: https://cse.google.com/cse/publicurl?cx=003284443790305850415:xbxu60ofaec

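My task is to implement a program that retrieves the results in JSON format and analyzes the article body. Unfortunately, the article body (a property/value pair) is automatically truncated, so I never get the whole article. For example:

"articlebody": "How to Exploit Routers on an Unrooted Android Phone RouterSploit is a powerful exploit framework similar to Metasploit, working to quickly identify and exploit common vulnerabilities in routers...."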

Is there a way to get the whole article in the JSON?

My current code:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class CustomSearchAPI {

    public static void main(String[] args) throws Exception {

        String key = "AIzaSyALOC-8_qk_IrT3MEx8JzQ2MmXPbtlBhJw";
        String qry = "exploit";
        // URL-encode the query so that spaces and special characters are safe.
        URL url = new URL(
                "https://www.googleapis.com/customsearch/v1?key=" + key
                + "&cx=003284443790305850415:xbxu60ofaec&q="
                + URLEncoder.encode(qry, "UTF-8") + "&alt=json");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept", "application/json");

        System.out.println("Output from Server .... \n");
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String output;
            while ((output = br.readLine()) != null) {
                System.out.println(output);
            }
        } finally {
            conn.disconnect();
        }
    }
}
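
To retrieve an article in full, a program first needs the address of each result. The Custom Search JSON API returns it in the "link" property of every entry under "items". The following is only a minimal sketch of collecting those addresses from the response text: the class name ResultLinks, the method extractLinks, and the sample response are hypothetical, and the regular expression is a shortcut that assumes "link" does not also appear as a property name elsewhere in the response (for instance inside "pagemap"). A real JSON parser would be more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResultLinks {

    // Matches "link": "<value>" pairs; backslash-escaped characters inside the value are allowed.
    private static final Pattern LINK =
            Pattern.compile("\"link\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Returns the value of every "link" property found in the JSON text, in order. */
    public static List<String> extractLinks(String json) {
        List<String> links = new ArrayList<>();
        Matcher m = LINK.matcher(json);
        while (m.find()) {
            // Undo the optional JSON escaping of forward slashes.
            links.add(m.group(1).replace("\\/", "/"));
        }
        return links;
    }

    public static void main(String[] args) {
        // Hypothetical, heavily shortened response used only for illustration.
        String sample = "{\"items\":[{\"title\":\"A\",\"link\":\"https://example.com/a\"},"
                + "{\"title\":\"B\",\"link\":\"https://example.com/b\"}]}";
        for (String link : extractLinks(sample)) {
            System.out.println(link);
        }
    }
}
```

In the program above, the lines read from the server could be appended to a StringBuilder instead of being printed, and the resulting string passed to extractLinks.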

My dependencies in pom.xml:

<dependencies>
    <!-- https://mvnrepository.com/artifact/org.mongodb/mongo-java-driver -->
    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>mongo-java-driver</artifactId>
        <version>3.4.2</version>
    </dependency>
    <dependency>
        <groupId>com.google.apis</groupId>
        <artifactId>google-api-services-customsearch</artifactId>
        <version>v1-rev56-1.22.0</version>
    </dependency>
</dependencies>
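
None of the dependencies above parses HTML, so one possible workaround for the truncated articlebody, assuming the API only returns the shortened value, is to download the page behind a result's "link" and reduce it to text. The sketch below uses only the JDK. The class ArticleFetcher and its methods are hypothetical; the tag stripping is deliberately crude, returns all text on the page rather than only the article, does not decode HTML entities, and assumes the page is UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ArticleFetcher {

    /** Downloads the page at the given address and returns its raw HTML. */
    public static String fetchHtml(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");
        StringBuilder html = new StringBuilder();
        // Assumption: the page is UTF-8; a real crawler should honour the declared charset.
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                html.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return html.toString();
    }

    /** Drops script and style blocks, then all remaining tags, then collapses whitespace. */
    public static String htmlToText(String html) {
        return html
                .replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ")
                .replaceAll("(?s)<[^>]+>", " ")
                .replaceAll("\\s+", " ")
                .trim();
    }

    public static void main(String[] args) throws IOException {
        // Pass a result URL on the command line to print the text of that page.
        if (args.length > 0) {
            System.out.println(htmlToText(fetchHtml(args[0])));
        }
    }
}
```

Isolating just the article from the surrounding navigation and comments would need site-specific handling or a dedicated HTML parsing library, which is outside this sketch.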

0 answers:

No answers yet