I am writing a program that fetches the domains hosted on the same server and can also scan web directories.
#!/usr/bin/env python
# encoding: utf-8
import threading
import urllib, urllib2, httplib
from urllib2 import Request, urlopen, URLError
import Queue, sys
import re

concurrent = 5
url = sys.argv[1]

class Scanner(threading.Thread):
    def __init__(self, work_q):
        threading.Thread.__init__(self)
        self.work_q = work_q

    def getdomains(self):
        doreq = Request('http://www.logontube.com/website/' + url)
        response = urlopen(doreq)
        html = response.read()
        response.close()
        domains = re.findall('<br><a href=\"(.*?)\" target=\"_blank\"', html)
        return domains

    def run(self):
        alldomains = self.getdomains()
        pathline = [line.rstrip() for line in open("path.txt")]
        while True:
            for aim in alldomains:
                for path in pathline:
                    path = self.work_q.get()
                    req = Request(aim + path)
                    try:
                        response = urlopen(req)
                    except URLError, e:
                        if hasattr(e, 'reason'):
                            print aim + path, 'Not Found'
                        elif hasattr(e, 'code'):
                            print aim + path, 'Not Found'
                    else:
                        try:
                            logs = open('log.txt', "a+")
                        except(IOError):
                            print "[x] Failed to create log file"
                        print aim + path, "Found"
                        logs.writelines(aim + path + "\n")
                        logs.close()

def main():
    work_q = Queue.Queue()
    paths = [line.rstrip() for line in open("path.txt")]
    for i in range(concurrent):
        t = Scanner(work_q)
        t.setDaemon(True)
        t.start()
    for path in paths:
        work_q.put(path)
    work_q.join()

main()
The problem is that the program only loops over the paths, so I only get scan results for one site. I found the issue here:

for path in paths:
    work_q.put(path)  # The program finishes once it has put all the paths
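For context, here is a minimal standalone sketch (not part of the scanner; the paths are placeholders) of why this matters: every item put into a Queue.Queue can be taken out with get() exactly once, so once the workers have consumed all the queued paths for the first domain, nothing is left for the remaining domains.

import Queue

q = Queue.Queue()
for path in ['/index.php', '/robots.txt']:
    q.put(path)

# Each get() permanently removes one item from the queue
print q.get()    # '/index.php'
print q.get()    # '/robots.txt'
print q.empty()  # True -- a further get() would block forever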
If you want to help me test this program, you may need some site directories (save them as path.txt):
/default.asp
/index.asp
/index.htm
/index.html
/index.jsp
/index.php
/admin.asp
/admin.php
/admin.shtml
/admin.txt
/admin_admin.asp
/config.asp
/inc/
/login.asp
/login.jsp
/login.php
/login/
/phpinfo.php
/readme.txt
/robots.txt
/test.asp
/test.html
/test.txt
/test.php
/news/readme.txt
/addmember/
Answer 0 (score: 0)
You need a:

while 1:
    pass

or something that waits for the threads to finish before exiting.
What is happening is that you start the threads, but then the main thread terminates, so you never get to see the threads' results.
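One common way to wait, sketched below as an illustrative pattern rather than the poster's exact code (the worker body and path list are placeholders), is to have each worker call task_done() for every item it pulls off the queue, so that Queue.join() in the main thread only returns once every queued path has been processed:

import threading
import Queue

work_q = Queue.Queue()

def worker():
    while True:
        path = work_q.get()           # blocks until an item is available
        try:
            print "scanning", path    # placeholder for the real request
        finally:
            work_q.task_done()        # tell the queue this item is finished

for i in range(5):
    t = threading.Thread(target=worker)
    t.setDaemon(True)                 # daemon threads exit with the main thread
    t.start()

for path in ['/index.php', '/robots.txt', '/admin.php']:
    work_q.put(path)

work_q.join()                         # returns only when every put() item got a task_done()
print "all paths processed"

Alternatively, keep the Thread objects in a list and call join() on each of them after filling the queue.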