I am connecting to URLs in Java to check whether they are valid, and I am wondering whether I also need to connect over HTTPS (port 443?) or whether connecting over plain HTTP (port 80) is enough.
Does connecting over HTTP work for an HTTPS website? Is there anything about firewalls I should watch out for that might prevent this?
Thanks.
Answer 0 (score: 1)
Since you rephrased your question, I'll update my answer accordingly. To stay with your example: checking URLs on port 80 is completely independent of checking URLs on port 443. Port 80 might serve the same content as port 443. Port 80 might serve the end-user content while port 443 serves the admin login. Apache might be listening on port 80 while nginx listens on port 443.
So to get all of the content, you need to scan both ports. Also be prepared to occasionally find two kinds of content that have nothing to do with each other. Admittedly this is rare, but it can happen.
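As a rough illustration of scanning both ports, here is a minimal sketch that builds one URL per scheme for a host and probes each one independently. The class name `SchemeProbe`, its method names, and the 5-second timeouts are assumptions made for this example; nothing here comes from the original question. Redirect following is switched off so that each port reports its own status.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.List;

// Hypothetical helper for illustration only.
class SchemeProbe {

    // One candidate per scheme; the default ports (80 and 443) are implied.
    static List<String> candidates(String host) {
        return List.of("http://" + host + "/", "https://" + host + "/");
    }

    // Returns the HTTP status code, or -1 if the URL is unusable
    // or no response could be obtained (refused, timed out, TLS failure, ...).
    static int probe(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(5000);            // assumed timeout, in milliseconds
            conn.setReadTimeout(5000);               // assumed timeout, in milliseconds
            conn.setInstanceFollowRedirects(false);  // report this port's own answer
            conn.setRequestMethod("GET");
            try {
                return conn.getResponseCode();
            } finally {
                conn.disconnect();
            }
        } catch (IOException | IllegalArgumentException e) {
            return -1;
        }
    }
}
```

Calling `SchemeProbe.probe(url)` for each entry of `SchemeProbe.candidates("example.com")` gives two independent results, which may or may not agree.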
Regarding firewalls:
If a web-service is intended to be public, firewalls will happily allow you to connect to the service. If a web-service is intended to be private and you can connect to it nonetheless, the firewall admin made a mistake :)
HTH
Answer 1 (score: 1)
If you want to check that URLs are "valid" I think you want to know if they respond with a 200 status code to a GET request.
You'll need to check HTTP and HTTPS separately if you want to know whether they both work. They're two different protocols, and servers handle them differently. Some servers mirror the same content over both, but many redirect HTTP requests to HTTPS.
Also, not every server supports SSL/TLS connections, so HTTPS might not be available.
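One way to tell these cases apart is sketched below: a GET request with redirect following disabled, so that a 200, a redirect, and a failed TLS handshake each show up as a distinct result. The class `UrlValidator`, the `Result` enum, and the timeout values are hypothetical names and choices for this example, and treating only 200 as "valid" simply follows the definition suggested above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import javax.net.ssl.SSLException;

// Hypothetical helper for illustration only.
class UrlValidator {

    enum Result { OK, REDIRECT, OTHER_STATUS, TLS_FAILURE, UNREACHABLE }

    // Maps a status code to a coarse result: 200 is "valid", 3xx is a redirect.
    static Result classify(int status) {
        if (status == 200) return Result.OK;
        if (status >= 300 && status < 400) return Result.REDIRECT;
        return Result.OTHER_STATUS;
    }

    static Result check(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(5000);            // assumed timeout, in milliseconds
            conn.setReadTimeout(5000);               // assumed timeout, in milliseconds
            conn.setInstanceFollowRedirects(false);  // surface HTTP -> HTTPS redirects
            conn.setRequestMethod("GET");
            try {
                return classify(conn.getResponseCode());
            } finally {
                conn.disconnect();
            }
        } catch (SSLException e) {
            // TLS handshake failed, e.g. HTTPS is not really offered on that port.
            return Result.TLS_FAILURE;
        } catch (IOException | IllegalArgumentException e) {
            // Refused, timed out, unknown host, or not a usable absolute URL.
            return Result.UNREACHABLE;
        }
    }
}
```

With this sketch, `UrlValidator.check("http://example.com/")` and `UrlValidator.check("https://example.com/")` are evaluated separately; a `REDIRECT` on the HTTP URL together with `OK` on the HTTPS URL would match the common redirect setup described above.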