How to read specific content from a web page and store it in an Excel sheet using WebDriver

Time: 2015-05-27 12:28:10

Tags: java selenium

We have an application that displays a table like the one below:

Group ID       Group Name             Organization ID  Organization Type User Limit    
GP00000517001  SIPtest Site hostpool  SIP00000517      Enterprise        10000     
GP8566747001   SIT mars test SIP te   SIP8566747       Enterprise        10000 

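There are about 500 records under the Group ID column. I want to read only the Group IDs from this page and store them in an Excel sheet. How can I do this? Please advise.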

1 answer:

Answer 0: (score: 0)

I recommend using an HTML parsing library such as [HtmlParser][1] to read the page, because it lets you locate the table content by ID or tag name.

A Jsoup example:

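    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    // Fetch the page, then read the text of the element whose id is "_eEe"
    String url = "http://www.google.com/";
    Document document = Jsoup.connect(url).get();
    String question = document.select("#_eEe").text();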
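
To pull only the Group ID column, the same Jsoup approach can target the first cell of each table row. The sketch below is only an illustration: the sample HTML, the class name `GroupIdExtractor`, and the assumption that header cells are `<th>` elements are made up for the example and may not match the real page. With Selenium, the HTML string could come from `driver.getPageSource()`.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class GroupIdExtractor {

    // Returns the text of the first <td> in every table row.
    // Assumption: header cells are <th>, so the header row is skipped.
    static List<String> extractGroupIds(String html) {
        Document document = Jsoup.parse(html);
        List<String> groupIds = new ArrayList<>();
        for (Element cell : document.select("table tr td:first-child")) {
            groupIds.add(cell.text().trim());
        }
        return groupIds;
    }

    public static void main(String[] args) {
        // Hypothetical markup standing in for the application's page
        String html = "<table>"
                + "<tr><th>Group ID</th><th>Group Name</th></tr>"
                + "<tr><td>GP00000517001</td><td>SIPtest Site hostpool</td></tr>"
                + "<tr><td>GP8566747001</td><td>SIT mars test SIP te</td></tr>"
                + "</table>";
        System.out.println(extractGroupIds(html));
        // prints [GP00000517001, GP8566747001]
    }
}
```

If the real table has an `id` attribute, narrowing the selector to that table (for example `#someTableId tr td:first-child`, where the id is a placeholder) would avoid picking up cells from other tables on the page.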

Then use [Apache POI][2] to save the data to an Excel sheet:

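    import java.io.File;
    import java.io.FileOutputStream;
    import org.apache.poi.hssf.usermodel.HSSFSheet;
    import org.apache.poi.hssf.usermodel.HSSFWorkbook;
    import org.apache.poi.ss.usermodel.Cell;
    import org.apache.poi.ss.usermodel.Row;

    HSSFWorkbook workbook = new HSSFWorkbook();
    HSSFSheet sheet = workbook.createSheet("Sample");
    // Row and cell indexes are zero-based, so this writes to cell B2
    Row row = sheet.createRow(1);
    Cell cell = row.createCell(1);
    cell.setCellValue("Sample");
    FileOutputStream out = new FileOutputStream(new File("C:\\new.xls"));
    workbook.write(out);
    out.close();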
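
For around 500 Group IDs, the single-cell example above extends to a loop that writes one row per value. This is a sketch under assumptions: the class name `GroupIdSheetWriter`, the sheet name, and the output file `group-ids.xls` are hypothetical, and using try-with-resources on the workbook assumes a POI release in which `Workbook` implements `Closeable`.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

class GroupIdSheetWriter {

    // Writes a header in row 0, then one Group ID per row in column A.
    static void writeGroupIds(List<String> groupIds, File target) throws IOException {
        try (Workbook workbook = new HSSFWorkbook();
             FileOutputStream out = new FileOutputStream(target)) {
            Sheet sheet = workbook.createSheet("Group IDs");
            sheet.createRow(0).createCell(0).setCellValue("Group ID");
            for (int i = 0; i < groupIds.size(); i++) {
                Row row = sheet.createRow(i + 1); // data starts below the header
                row.createCell(0).setCellValue(groupIds.get(i));
            }
            workbook.write(out);
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample values taken from the table in the question
        List<String> groupIds = Arrays.asList("GP00000517001", "GP8566747001");
        writeGroupIds(groupIds, new File("group-ids.xls"));
    }
}
```

The list passed to `writeGroupIds` can be whatever the HTML parsing step produced, so the page reading and the Excel writing stay independent of each other.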

[1]: http://htmlparser.sourceforge.net/
[2]: https://poi.apache.org/spreadsheet/how-to.html