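Using Python, how can I check whether a website is up? From what I've read, I need to send an "HTTP HEAD" request and check for the status code "200 OK", but how do I do that?
Cheers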
Answer 0 (score: 76)
You can try doing this with getcode() from urllib:
>>> import urllib
>>> print urllib.urlopen("http://www.stackoverflow.com").getcode()
200
Edit: for more modern Python, i.e. Python 3, use:
>>> import urllib.request
>>> print(urllib.request.urlopen("http://www.stackoverflow.com").getcode())
200
Answer 1 (score: 15)
I think the easiest way to do it is with the Requests module.
import requests

def url_ok(url):
    r = requests.head(url)
    return r.status_code == 200
Answer 2 (score: 9)
You can use httplib:
import httplib
conn = httplib.HTTPConnection("www.python.org")
conn.request("HEAD", "/")
r1 = conn.getresponse()
print r1.status, r1.reason
This prints:
200 OK
Of course, only if www.python.org is up.
Answer 3 (score: 7)
import httplib
import socket
import re

def is_website_online(host):
    """ This function checks to see if a host name has a DNS entry by checking
    for socket info. If the website gets something in return,
    we know it's available to DNS.
    """
    try:
        socket.gethostbyname(host)
    except socket.gaierror:
        return False
    else:
        return True

def is_page_available(host, path="/"):
    """ This function retrieves the status code of a website by requesting
    HEAD data from the host. This means that it only requests the headers.
    If the host cannot be reached or something else goes wrong, it returns
    False.
    """
    try:
        conn = httplib.HTTPConnection(host)
        conn.request("HEAD", path)
        # Treat any 2xx or 3xx status as "available"; anything else is False.
        return bool(re.match(r"^[23]\d\d$", str(conn.getresponse().status)))
    except (StandardError, httplib.HTTPException):
        return False
Answer 4 (score: 5)
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

req = Request("http://stackoverflow.com")
try:
    response = urlopen(req)
except HTTPError as e:
    print('The server couldn\'t fulfill the request.')
    print('Error code: ', e.code)
except URLError as e:
    print('We failed to reach a server.')
    print('Reason: ', e.reason)
else:
    print('Website is working fine')
This works on Python 3.
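Answer 5 (score: 4)
The HTTPConnection object from the httplib module in the standard library will probably do the trick for you. By the way, if you start doing anything advanced with HTTP in Python, be sure to check out httplib2; it's a great library.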
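In Python 3, httplib was renamed http.client. A HEAD check with HTTPConnection might look like the minimal sketch below. The helper name head_status, its defaults, and the choice to return None on failure are assumptions made for illustration; they do not come from the answer above.

```python
import http.client


def head_status(host, path="/", port=80, timeout=5):
    """Send a HEAD request and return the HTTP status code, or None on failure.

    Hypothetical helper: the name, the defaults and the None-on-error
    convention are illustrative assumptions.
    """
    # Creating the object does not open the connection yet.
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    except (http.client.HTTPException, OSError):
        # OSError covers DNS failures, refused connections and timeouts.
        return None
    finally:
        conn.close()
```

A call such as head_status("www.python.org") returns the integer status code when the server answers, so the caller can decide whether only 200 or any 2xx/3xx code counts as "up".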
Answer 6 (score: 2)
If the server is down, urllib on Python 2.7 x86 Windows has no timeout and the program hangs. So use urllib2:
import urllib2
import socket

def check_url(url, timeout=5):
    try:
        return urllib2.urlopen(url, timeout=timeout).getcode() == 200
    except urllib2.URLError as e:
        return False
    except socket.timeout as e:
        return False

print check_url("http://google.fr")  # True
print check_url("http://notexist.kc")  # False
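On Python 3, urllib2 no longer exists; its functionality lives in urllib.request and urllib.error. The same timeout-based check could be sketched as follows. This is a port made under that assumption, not code from the answer, and it keeps the check_url name only for continuity.

```python
import socket
import urllib.error
import urllib.request


def check_url(url, timeout=5):
    """Return True if the URL answers with HTTP 200 within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.URLError:
        # Also covers HTTPError (4xx/5xx responses), which subclasses URLError.
        return False
    except (socket.timeout, ConnectionError):
        # A timeout or reset can also surface while the response is being read.
        return False
```

Note that urlopen follows redirects, so the status tested here is that of the final response.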
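Answer 7 (score: 1)
If by "up" you simply mean "the server is serving", then you could use cURL: if you get a response, it's up.
I can't give you specific advice because I'm not a Python programmer, but here is a link to pycurl: http://pycurl.sourceforge.net/.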
Answer 8 (score: 1)
Hi, this code can run an up/down check and a speed test for your web page:
from urllib.request import urlopen
from socket import socket
import time

def tcp_test(server_info):
    # server_info is expected in "host:port" form.
    cpos = server_info.find(':')
    try:
        sock = socket()
        sock.connect((server_info[:cpos], int(server_info[cpos+1:])))
        sock.close()
        return True
    except Exception as e:
        return False

def http_test(server_info):
    try:
        # TODO : we can use this data after to find sub urls up or down results
        startTime = time.time()
        data = urlopen(server_info).read()
        endTime = time.time()
        speed = endTime - startTime
        return {'status' : 'up', 'speed' : str(speed)}
    except Exception as e:
        return {'status' : 'down', 'speed' : str(-1)}

def server_test(test_type, server_info):
    if test_type.lower() == 'tcp':
        return tcp_test(server_info)
    elif test_type.lower() == 'http':
        return http_test(server_info)
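One caveat with a bare socket().connect() is that no timeout is set, so an unresponsive host can block the call for a long time. A variant with an explicit timeout might look like this sketch; the name tcp_check and the 3-second default are assumptions for illustration.

```python
import socket


def tcp_check(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        # create_connection resolves the host and applies the timeout;
        # the with-block closes the socket on exit.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

A successful TCP connection only shows that something is listening on the port; it does not show that the web application behind it is responding correctly.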
Answer 9 (score: 1)
You can use the requests library to find out whether the website is up, i.e. whether the status code is 200:
import requests
url = "https://www.google.com"
page = requests.get(url)
print(page.status_code)
# 200
Answer 10 (score: 1)
I use requests for this, which keeps it simple and clean. Instead of the print function, you can define and call a new function (to notify via email, etc.). The try-except block is essential: if the host is unreachable, many different exceptions can be raised, so you need to catch them all.
import requests

URL = "https://api.github.com"

try:
    response = requests.head(URL)
except Exception as e:
    print(f"NOT OK: {str(e)}")
else:
    if response.status_code == 200:
        print("OK")
    else:
        print(f"NOT OK: HTTP response code {response.status_code}")
Answer 11 (score: 0)
Here is my solution using PycURL and validators:
import pycurl, validators

def url_exists(url):
    """
    Check if the given URL really exists
    :param url: str
    :return: bool
    """
    if validators.url(url):
        c = pycurl.Curl()
        c.setopt(pycurl.NOBODY, True)
        c.setopt(pycurl.FOLLOWLOCATION, False)
        c.setopt(pycurl.CONNECTTIMEOUT, 10)
        c.setopt(pycurl.TIMEOUT, 10)
        c.setopt(pycurl.COOKIEFILE, '')
        c.setopt(pycurl.URL, url)
        try:
            c.perform()
            response_code = c.getinfo(pycurl.RESPONSE_CODE)
            c.close()
            return response_code < 400
        except pycurl.error as err:
            # pycurl.error carries (errno, errstr) in its args tuple.
            errno, errstr = err.args
            raise OSError('An error occurred: {}'.format(errstr))
    else:
        raise ValueError('"{}" is not a valid url'.format(url))
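Answer 12 (score: 0)
# Using requests.
# (Wrapped in a function so the return statements are valid;
# the function names here are illustrative.)
import requests

def is_up_requests(value):
    response = requests.get(value)
    return response.status_code == 200

# Using httplib2.
import httplib2

def is_up_httplib2(value):
    try:
        http = httplib2.Http()
        response = http.request(value, 'HEAD')
        if int(response[0]['status']) == 200:
            return True
    except Exception:
        pass
    return False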
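Answer 13 (score: 0)
I think caisah's answer misses an important part of your question, namely handling the case where the server is offline.
Still, using requests is my favorite option:
import requests

try:
    requests.get(url)
except requests.exceptions.ConnectionError:
    print(f"URL {url} not reachable")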
Answer 14 (score: 0)
My 2 cents:
import urllib.request

def getResponseCode(url):
    conn = urllib.request.urlopen(url)
    return conn.getcode()

if getResponseCode(url) != 200:
    print('Wrong URL')
else:
    print('Good URL')