I am writing a program that fetches the domains hosted on the same server and can also scan web directories.
#!/usr/bin/env python
# encoding: utf-8
import threading
import urllib, urllib2, httplib
from urllib2 import Request, urlopen, URLError
import Queue, sys
import re

concurrent = 5
url = sys.argv[1]

class Scanner(threading.Thread):
    def __init__(self, work_q):
        threading.Thread.__init__(self)
        self.work_q = work_q

    def getdomains(self):
        doreq = Request('http://www.logontube.com/website/' + url)
        response = urlopen(doreq)
        html = response.read()
        response.close()
        domains = re.findall('<br><a href=\"(.*?)\" target=\"_blank\"', html)
        return domains

    def run(self):
        alldomains = self.getdomains()
        pathline = [line.rstrip() for line in open("path.txt")]
        while True:
            for aim in alldomains:
                for path in pathline:
                    path = self.work_q.get()
                    req = Request(aim + path)
                    try:
                        response = urlopen(req)
                    except URLError, e:
                        if hasattr(e, 'reason'):
                            print aim + path, 'Not Found'
                        elif hasattr(e, 'code'):
                            print aim + path, 'Not Found'
                    else:
                        try:
                            logs = open('log.txt', "a+")
                        except(IOError):
                            print "[x] Failed to create log file"
                        print aim + path, "Found"
                        logs.writelines(aim + path + "\n")
                        logs.close()

def main():
    work_q = Queue.Queue()
    paths = [line.rstrip() for line in open("path.txt")]
    for i in range(concurrent):
        t = Scanner(work_q)
        t.setDaemon(True)
        t.start()
    for path in paths:
        work_q.put(path)
    work_q.join()

main()
The problem is that the program only loops over the paths, so I only get scan results for one site. I found the issue here:

for path in paths:
    work_q.put(path)  # The program finishes once it has put all the paths
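For context, here is a minimal standalone sketch (not part of the scanner; the paths are placeholders) of why this matters: every item put into a Queue.Queue can be taken out with get() exactly once, so once the workers have consumed all the queued paths for the first domain, nothing is left for the remaining domains.

import Queue

q = Queue.Queue()
for path in ['/index.php', '/robots.txt']:
    q.put(path)

# Each get() permanently removes one item from the queue
print q.get()    # '/index.php'
print q.get()    # '/robots.txt'
print q.empty()  # True -- a further get() would block forever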
If you want to help me test this program, you may need some site directories (save them as path.txt):
/default.asp
/index.asp
/index.htm
/index.html
/index.jsp
/index.php
/admin.asp
/admin.php
/admin.shtml
/admin.txt
/admin_admin.asp
/config.asp
/inc/
/login.asp
/login.jsp
/login.php
/login/
/phpinfo.php
/readme.txt
/robots.txt
/test.asp
/test.html
/test.txt
/test.php
/news/readme.txt
/addmember/
Answer 0 (score: 0)
You need a:

while 1:
    pass

or something that waits for the threads to finish before exiting.
What is happening is that you start the threads, but then the main thread terminates, so you never get to see the threads' results.
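One common way to wait, sketched below as an illustrative pattern rather than the poster's exact code (the worker body and path list are placeholders), is to have each worker call task_done() for every item it pulls off the queue, so that Queue.join() in the main thread only returns once every queued path has been processed:

import threading
import Queue

work_q = Queue.Queue()

def worker():
    while True:
        path = work_q.get()           # blocks until an item is available
        try:
            print "scanning", path    # placeholder for the real request
        finally:
            work_q.task_done()        # tell the queue this item is finished

for i in range(5):
    t = threading.Thread(target=worker)
    t.setDaemon(True)                 # daemon threads exit with the main thread
    t.start()

for path in ['/index.php', '/robots.txt', '/admin.php']:
    work_q.put(path)

work_q.join()                         # returns only when every put() item got a task_done()
print "all paths processed"

Alternatively, keep the Thread objects in a list and call join() on each of them after filling the queue.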