I am trying to download a file from a website using HtmlUnit 2.11, but I get an UnknownHostException. The code and the full stack trace are below.
Code:
final WebClient webClient = new WebClient(
BrowserVersion.INTERNET_EXPLORER_8);
URL Url = new URL("https://340bopais.hrsa.gov/reports");
HtmlPage page = webClient.getPage(Url);
HtmlSubmitInput button = page
.getElementByName("ContentPlaceHolder1_lnkCEDailyReport");
final HtmlPage page2 = button.click();
Stack trace:
Exception in thread "main" java.net.UnknownHostException: 340bopais.hrsa.gov
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(Unknown Source)
at java.net.InetAddress.getAddressesFromNameService(Unknown Source)
at java.net.InetAddress.getAllByName0(Unknown Source)
at java.net.InetAddress.getAllByName(Unknown Source)
at java.net.InetAddress.getAllByName(Unknown Source)
at org.apache.http.impl.conn.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:45)
at org.apache.http.impl.conn.DefaultClientConnectionOperator.resolveHostname(DefaultClientConnectionOperator.java:278)
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:162)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:640)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:479)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:171)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1484)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1402)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:304)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:373)
at src.main.java.DataDownloader.main(DataDownloader.java:30)
Answer 0 (score: 1)
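PING (Packet Internet Groper) is a utility that uses ICMP (Internet Control Message Protocol).
HTTPS is an application-layer protocol that runs over TCP, by default on port 443.
Many network providers and service administrators restrict access to their resources to only the necessary protocols and ports.
The organization hosting 340bopais.hrsa.gov has most likely configured its firewall and other network infrastructure to allow only TCP traffic on ports 80 and 443 to its servers, so a failed ping does not by itself mean the site is unreachable over HTTPS.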
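To separate the two possible failure modes (name resolution versus a blocked port), a small check using only the JDK can help. The following is a minimal sketch, not part of the original answer: the class name `ConnectivityCheck`, its helper methods, and the 5-second timeout are illustrative assumptions. It first asks the system resolver for the host name, which is the step that fails in the question's stack trace, and then attempts a plain TCP connection to port 443 instead of relying on ping.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

// Hypothetical helper class, introduced only for this illustration.
public class ConnectivityCheck {

    // Returns true if the system resolver can map the host name to an address.
    static boolean canResolve(String host) {
        try {
            InetAddress.getAllByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolved hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = "340bopais.hrsa.gov";
        System.out.println("DNS resolves: " + canResolve(host));
        System.out.println("TCP 443 reachable: " + canConnect(host, 443, 5000));
    }
}
```

If the first line prints false, the problem lies in DNS or proxy configuration on the machine running the code rather than in HtmlUnit or in the remote firewall.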
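Update
I got it working and downloaded the file using Java and Selenium. I put the complete code in a repository that you can download, but here is how to use it:
Create a Maven project in Eclipse.
Add a folder named driver to the resources folder (src/main/resources).
Download the Chrome driver (chromedriver.exe, the file name the code below expects) and put it in the driver folder.
Add this dependency to pom.xml: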
<dependency>
<groupId>org.seleniumhq.selenium</groupId>
<artifactId>selenium-java</artifactId>
<version>3.4.0</version>
</dependency>
In the main method, type:
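import java.io.File;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public static void main(String[] args) {
    // Locate chromedriver.exe inside the driver folder on the classpath.
    File file = new File(StackApplication.class.getClassLoader().getResource("driver/chromedriver.exe").getFile());
    String driverPath = file.getAbsolutePath();
    System.out.println("Webdriver is in path: " + driverPath);
    // Tell Selenium where the Chrome driver executable lives.
    System.setProperty("webdriver.chrome.driver", driverPath);
    WebDriver driver = new ChromeDriver();
    driver.navigate().to("https://340bopais.hrsa.gov/reports");
    // Expand the collapsed section, then click the daily report link.
    driver.findElement(By.xpath("//*[@id=\"headingTwo\"]/h4/a")).click();
    driver.findElement(By.xpath("//*[@id=\"ContentPlaceHolder1_lnkCEDailyReport\"]")).click();
}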
It works like a charm.
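One detail of the driver-path step is worth a sketch: `URL.getFile()` returns a URL-encoded string, so a project path containing spaces can come back with `%20` in it. The variant below is an illustration under stated assumptions, not the answer's code: the class `DriverPathResolver` and its method `toAbsolutePath` are hypothetical names, and the approach only works when the resource is a plain file on disk (not packed inside a jar).

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, introduced only for this illustration.
public class DriverPathResolver {

    // Converts a file-based classpath resource URL into a decoded absolute path.
    static String toAbsolutePath(URL resource) {
        if (resource == null) {
            throw new IllegalStateException("Resource not found on the classpath");
        }
        try {
            // Going through a URI decodes escapes such as %20.
            Path path = Paths.get(resource.toURI());
            return path.toAbsolutePath().toString();
        } catch (URISyntaxException e) {
            throw new IllegalStateException("Resource URL is not a valid URI: " + resource, e);
        }
    }

    public static void main(String[] args) {
        // Same resource name as in the answer; throws if the file is missing.
        URL resource = DriverPathResolver.class.getClassLoader()
                .getResource("driver/chromedriver.exe");
        String driverPath = toAbsolutePath(resource);
        System.setProperty("webdriver.chrome.driver", driverPath);
        System.out.println("Webdriver is in path: " + driverPath);
    }
}
```

The explicit null check turns a missing driver file into a readable error instead of a NullPointerException.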
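Answer 1 (score: 0)
I think there is a problem with this site's security certificate; I tried opening your URL in a browser.
By default, accessing an HTTPS URL using the URL class results in an exception if the server's certificate chain cannot be validated because it has not previously been installed in the truststore. If you want to disable certificate validation for testing purposes, you need to override the default trust manager with one that trusts all certificates.
Try this, it may solve your problem: doStuff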
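The trust-all approach described above can be sketched with the standard JSSE classes. This is an illustration for test environments only, not the answer's original snippet: the class name `TrustAllForTesting` and its helper methods are assumptions made for the example. It installs a trust manager that accepts every certificate chain as the default for `HttpsURLConnection`.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical helper class, introduced only for this illustration.
// Never use this in production: it disables certificate validation.
public class TrustAllForTesting {

    // A trust manager whose checks always pass.
    static X509TrustManager trustAllManager() {
        return new X509TrustManager() {
            @Override
            public void checkClientTrusted(X509Certificate[] chain, String authType) { }

            @Override
            public void checkServerTrusted(X509Certificate[] chain, String authType) { }

            @Override
            public X509Certificate[] getAcceptedIssuers() {
                return new X509Certificate[0];
            }
        };
    }

    // Builds an SSLContext backed only by the trust-all manager.
    static SSLContext trustAllContext() {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, new TrustManager[] { trustAllManager() }, new SecureRandom());
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not create SSL context", e);
        }
    }

    public static void main(String[] args) {
        // Affects HttpsURLConnection instances created after this call.
        HttpsURLConnection.setDefaultSSLSocketFactory(trustAllContext().getSocketFactory());
        System.out.println("Default HTTPS socket factory now skips certificate validation");
    }
}
```

This changes only certificate-chain validation for `HttpsURLConnection`; host name verification and other HTTP clients keep their own settings.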