Is it possible to extract data from a PDF with Selenium WebDriver when the URL is a query string?

Time: 2018-11-21 13:02:23

Tags: java selenium pdfbox

My URL is www.test.com/PDFFile/GeneratePDF?queryString=PWTBxi3jD0330OuqnuaNT

My scenario: I click a link that opens a PDF in a new tab, and I want to read the PDF and then close the tab. Below is my code:

    // Collect the handles of all open tabs and switch to the second one
    ArrayList<String> tabs = new ArrayList<String>(driver.getWindowHandles());
    System.out.println("No. of tabs: " + tabs.size());
    driver.switchTo().window(tabs.get(1));
    System.out.println(driver.getCurrentUrl());

    // Request the tab's URL directly from Java (a new request, made outside the browser)
    URL url = new URL(driver.getCurrentUrl());
    InputStream is = url.openStream();
    BufferedInputStream fileToParse = new BufferedInputStream(is);
    PDDocument document = null;
    try {
        // Parse the PDF and extract its text with PDFBox
        document = PDDocument.load(fileToParse);
        output = new PDFTextStripper().getText(document);
    } finally {
        if (document != null) {
            document.close();
        }
        fileToParse.close();
        is.close();
    }
    // Split the extracted text into lines
    words = output.split("\\n");
    return words;

The code throws the following error:

    java.io.IOException: Server returned HTTP response code: 500 for URL: http://www.test.com/PDFFile/GeneratePDF?queryString=PWTBxi3jD0330OuqnuaNT
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1876)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
        at java.net.URL.openStream(URL.java:1045)
        at com.hmids.utilities.PDFDataExtractor.readPDFInURL(PDFDataExtractor.java:37)

How can I handle this?
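
One possible cause, which nothing in the question confirms, is that `url.openStream()` sends a fresh request outside the browser session, so the server receives no session cookies and answers with 500. The following is a minimal sketch under that assumption: it attaches cookies to the request and checks the status code before reading the body. The class `SessionAwareFetch` and its methods are hypothetical names introduced only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Selenium or PDFBox.
class SessionAwareFetch {

    // Builds the value of a Cookie request header: "name1=value1; name2=value2".
    static String cookieHeader(Map<String, String> cookies) {
        StringJoiner joiner = new StringJoiner("; ");
        for (Map.Entry<String, String> cookie : cookies.entrySet()) {
            joiner.add(cookie.getKey() + "=" + cookie.getValue());
        }
        return joiner.toString();
    }

    // Opens the address with the given cookies attached and returns the body stream.
    // Throws an IOException naming the status code if the server does not answer 200.
    static InputStream open(String address, Map<String, String> cookies) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        if (!cookies.isEmpty()) {
            conn.setRequestProperty("Cookie", cookieHeader(cookies));
        }
        int status = conn.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK) {
            conn.disconnect();
            throw new IOException("Unexpected HTTP status " + status + " for " + address);
        }
        return conn.getInputStream();
    }
}
```

The cookie map could be filled from `driver.manage().getCookies()` by copying each cookie's `getName()` and `getValue()`, and the returned stream could then be passed to `PDDocument.load` as in the code above. If the server still returns 500 with the cookies attached, the failure is probably unrelated to the session and this sketch will not help.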

0 answers:

No answers