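Multiprocessing in Python 3

Time: 2015-07-10 19:27:08

Tags: python multithreading python-3.x multiprocessing cracking

I'm messing around with a zip file cracker and decided to use the multiprocessing module to speed it up. It was a complete pain, since this is my first time using the module and I don't fully understand it yet. Still, I got it to work.

The problem is that it doesn't get through the whole word list; it stops at a random point in the list. And if the password is found, it keeps going through the word list instead of stopping the process.

Does anyone know why it behaves this way?

Source code for the ZipFile cracker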

#!/usr/bin/env python3

import multiprocessing as mp
import zipfile # Handling the zipfile
import sys # Command line arguments, and quitting application
import time # To calculate runtime

def usage(program_name):
    print("Usage: {0} <path to zipfile> <dictionary>".format(program_name))
    sys.exit(1)

def cracker(password):
    try:
        zFile.extractall(pwd=password)
        print("[+] Password Found! : {0}".format(password.decode('utf-8')))
        pool.close()
    except:
        pass

def main():
    global zFile
    global pool

    if len(sys.argv) < 3:
        usage(sys.argv[0])

    zFile = zipfile.ZipFile(sys.argv[1])

    print("[*] Started Cracking")

    startime = time.time()
    pool = mp.Pool()

    for i in open(sys.argv[2], 'r', errors='ignore'):
        pswd = bytes(i.strip('\n'), 'utf-8')
        pool.apply_async(cracker, (pswd,))

    print (pswd)
    runtime =  round(time.time() - startime, 5)
    print ("[*] Runtime:", runtime, 'seconds')
    sys.exit(0)

if __name__ == "__main__":
    main()

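3 Answers:

Answer 0 (score: 2)

You are terminating your program too early. To test this, add a harmless time.sleep(10) to the cracker function and observe that your program still terminates within a second.

Call join to wait for the pool to finish: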

pool = mp.Pool()
for i in open(sys.argv[2], 'r', errors='ignore'):
    pswd = bytes(i.strip('\n'), 'utf-8')
    pool.apply_async(cracker, (pswd,))

pool.close()  # Indicate that no more data is coming
pool.join()   # Wait for pool to finish processing

runtime =  round(time.time() - startime, 5)
print ("[*] Runtime:", runtime, 'seconds')
sys.exit(0)

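Additionally, calling close once you have found the right password only indicates that no more tasks will be submitted - all tasks already submitted will still run. Instead, call terminate to kill the pool without processing any further tasks.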
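To make the difference between close and terminate concrete, here is a minimal sketch. The names slow_square and run_tasks and the pool_factory parameter are made up for illustration and are not part of the original program; the sleep merely stands in for a password attempt.

```python
import multiprocessing as mp
import time


def slow_square(n):
    """Stand-in for one password attempt; the sleep simulates real work."""
    time.sleep(0.2)
    return n * n


def run_tasks(numbers, stop_early=False, pool_factory=mp.Pool):
    """Submit one task per number; None marks a task that did not finish."""
    pool = pool_factory(2)
    pending = [pool.apply_async(slow_square, (n,)) for n in numbers]
    if stop_early:
        pool.terminate()  # stop the workers now; queued tasks are discarded
    else:
        pool.close()      # no new submissions; queued tasks still run
    pool.join()           # block until the worker processes have exited
    return [r.get() if r.ready() else None for r in pending]


if __name__ == "__main__":
    print(run_tasks([1, 2, 3, 4]))                   # every task completes
    print(run_tasks([1, 2, 3, 4], stop_early=True))  # unfinished tasks show as None
```

Without the join call, the main process would fall through to sys.exit while most tasks were still queued, which matches the "stops at a random point" symptom in the question.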

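Furthermore, depending on the implementation details of multiprocessing.Pool, the global variable pool may not be available when you need it (and its value isn't serializable anyway). To solve this, you can use a callback, as in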

def cracker(password):
    try:
        zFile.extractall(pwd=password)
    except RuntimeError:
        return
    return password

def callback(found):
    if found:
        pool.terminate()
...
pool.apply_async(cracker, (pswd,), callback=callback)

Of course, since you are now looking at every result anyway, apply is not the right tool. Instead, you can write your code using imap_unordered:

with open(sys.argv[2], 'r', errors='ignore') as passf, \
     mp.Pool() as pool:
    passwords = (line.strip('\n').encode('utf-8') for line in passf)
    for found in pool.imap_unordered(cracker, passwords):
        if found:
            break

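You may also want to open the zip file (and create a ZipFile object) in each process by using an initializer for the pool, instead of relying on global variables. Better yet (and much faster), skip the repeated I/O entirely: read just the bytes you need once and pass them to the child processes.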
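A rough sketch of the initializer idea follows. It takes a looser reading of "read just the bytes you need": the parent reads the whole archive once and each worker builds its ZipFile over those bytes in memory. The names init_worker, try_password and crack and the pool_factory parameter are hypothetical, and the set of exceptions treated as "wrong password" is an assumption that may not cover every compression method.

```python
import io
import multiprocessing as mp
import sys
import zipfile
import zlib

_archive = None  # per-process ZipFile, set by init_worker


def init_worker(data):
    """Runs once in each worker: build a ZipFile over the in-memory bytes."""
    global _archive
    _archive = zipfile.ZipFile(io.BytesIO(data))


def try_password(password):
    """Return password if the first member decrypts and reads, else None."""
    member = _archive.infolist()[0]
    try:
        with _archive.open(member, pwd=password) as fh:
            fh.read()  # forces decryption and the CRC check
    except (RuntimeError, zlib.error, zipfile.BadZipFile):
        return None
    return password


def crack(data, passwords, pool_factory=mp.Pool):
    """Try passwords in parallel; return the first accepted one, or None."""
    with pool_factory(initializer=init_worker, initargs=(data,)) as pool:
        for found in pool.imap_unordered(try_password, passwords, chunksize=64):
            if found is not None:
                return found  # leaving the with-block terminates the pool
    return None


if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], "rb") as f:
        archive_bytes = f.read()  # the only disk read of the archive
    with open(sys.argv[2], errors="ignore") as words:
        candidates = (w.rstrip("\n").encode("utf-8") for w in words)
        print(crack(archive_bytes, candidates))
```

Handing the bytes over through initargs keeps the workers off the disk, at the cost of one copy of the archive per process, so it suits small archives best.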

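Answer 1 (score: 1)

phihag's answer is the correct solution.

I just want to add a detail about calling terminate() once you've found the correct password. The pool variable was not defined in cracker() when I ran the code, so trying to call it from there simply raised an exception: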

NameError: name 'pool' is not defined

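(My fork() experience is weak, so I don't completely understand why the global zFile is copied to the child processes successfully while pool is not. Even if it were copied, it would not be the same pool as in the parent process, right? So calling any method on it would not affect the real pool in the parent. Regardless, I prefer the advice listed in the Programming guidelines section of the multiprocessing documentation: explicitly pass resources to child processes.)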
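One likely explanation, assuming the fork start method: the workers are forked inside the Pool() constructor, so they copy the parent's globals before the assignment to the name pool has taken place, whereas zFile was assigned earlier. The sketch below illustrates that ordering; names_seen_by_worker, demo and z_file are hypothetical names chosen for this demonstration.

```python
import multiprocessing as mp


def names_seen_by_worker():
    """Runs in a worker: which module-level names exist in this process?"""
    g = globals()
    return ("z_file" in g, "pool" in g)


def demo():
    global z_file, pool
    z_file = "bound before Pool() is called"
    # Workers are forked inside the constructor, so they copy the parent's
    # globals before the name `pool` has been assigned.
    pool = mp.get_context("fork").Pool(2)  # "fork" is not available on Windows
    try:
        return pool.apply(names_seen_by_worker)
    finally:
        pool.terminate()
        pool.join()
        del z_file, pool  # reset module state so demo() can be called again


if __name__ == "__main__":
    print(demo())  # (True, False) under the fork start method
```

With the spawn start method neither name would be visible, because the workers re-import the module and never run main(); that is one more reason to pass resources explicitly.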

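My suggestion is to have cracker() return the password if it is correct and None otherwise. Then pass apply_async() a callback that records the correct password and terminates the pool. Here is my take on modifying your code to do this: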

#!/usr/bin/env python3

import multiprocessing as mp
import zipfile # Handling the zipfile
import sys # Command line arguments, and quitting application
import time # To calculate runtime
import os

def usage(program_name):
    print("Usage: {0} <path to zipfile> <dictionary>".format(program_name))
    sys.exit(1)

def cracker(zip_file_path, password):
    print('[*] Starting new cracker (pid={0}, password="{1}")'.format(os.getpid(), password))

    try:
        time.sleep(1) # XXX: to simulate the task taking a bit of time
        with zipfile.ZipFile(zip_file_path) as zFile:
            zFile.extractall(pwd=bytes(password, 'utf-8'))
        return password
    except:
        return None

def main():
    if len(sys.argv) < 3:
        usage(sys.argv[0])

    print('[*] Starting main (pid={0})'.format(os.getpid()))

    zip_file_path = sys.argv[1]
    password_file_path = sys.argv[2]
    startime = time.time()
    actual_password = None

    with mp.Pool() as pool:
        def set_actual_password(password):
            nonlocal actual_password
            if password:
                print('[*] Found password; stopping future tasks')
                pool.terminate()
                actual_password = password

        with open(password_file_path, 'r', errors='ignore') as password_file:
            for pswd in password_file:
                pswd = pswd.strip('\n')
                pool.apply_async(cracker, (zip_file_path, pswd,), callback=set_actual_password)

        pool.close()
        pool.join()

    if actual_password:
        print('[*] Cracked password: "{0}"'.format(actual_password))
    else:
        print('[*] Unable to crack password')
    runtime =  round(time.time() - startime, 5)
    print("[*] Runtime:", runtime, 'seconds')
    sys.exit(0)

if __name__ == "__main__":
    main()

Answer 2 (score: 0)

Here is an implementation of the suggestions from @phihag's and @Equality 7-2521's answers:

#!/usr/bin/env python3
"""Brute force zip password.

Usage: brute-force-zip-password <zip archive> <passwords>
"""
import sys
from multiprocessing import Pool
from time import monotonic as timer
from zipfile import ZipFile

def init(archive): # run at the start of a worker process
    global zfile
    zfile = ZipFile(open(archive, 'rb')) # open file in each process once

def check(password):
    assert password
    try:
        with zfile.open(zfile.infolist()[0], pwd=password):
            return password # assume success
    except Exception as e:
        if e.args[0] != 'Bad password for file':
            # assume all other errors happen after the password was accepted
            raise RuntimeError(password) from e

def main():
    if len(sys.argv) != 3:
        sys.exit(__doc__) # print usage

    start = timer()
    # decode passwords using the preferred locale encoding
    with open(sys.argv[2], errors='ignore') as file, \
         Pool(initializer=init, initargs=[sys.argv[1]]) as pool: # use all CPUs
        # check passwords encoded using utf-8
        passwords = (line.rstrip('\n').encode('utf-8') for line in file)
        passwords = filter(None, passwords) # filter empty passwords
        for password in pool.imap_unordered(check, passwords, chunksize=100):
            if password is not None:  # found
                print("Password: '{}'".format(password.decode('utf-8')))
                break
        else:
            sys.exit('Unable to find password')
    print('Runtime: %.5f seconds' % (timer() - start,))

if __name__=="__main__":
    main()

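Notes:

  • each worker process has its own ZipFile object and the zip file is opened once per process: this should make the code more portable (Windows support) and improve time performance
  • the content is not extracted: check(password) tries to open an archive member and closes it immediately on success: this is safer and may improve time performance (no need to create directories, etc.)
  • all errors other than 'Bad password for file' raised while decrypting the archive member are assumed to happen after the password was accepted: the rationale is to avoid silencing unexpected errors - each exception should be considered individually
  • check(password) expects non-empty passwords
  • the chunksize parameter may improve performance drastically
  • the rarely used for / else syntax reports the case where no password is found
  • the with-statement calls pool.terminate() for you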
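Since the for / else construct is uncommon, here is a small stand-alone sketch of how it behaves. The helper first_accepted and its accepts argument are hypothetical and unrelated to the zipfile API: the else clause runs only when the loop ends without reaching break.

```python
def first_accepted(candidates, accepts):
    """Return the first candidate that accepts() approves, else None."""
    for candidate in candidates:
        if accepts(candidate):
            break  # skips the else clause below
    else:
        # Runs only when the loop finished without hitting break.
        return None
    return candidate


if __name__ == "__main__":
    print(first_accepted([b"a", b"secret", b"z"], lambda p: p == b"secret"))
    print(first_accepted([b"a", b"z"], lambda p: p == b"secret"))
```

In main() above, the same shape lets the "Unable to find password" exit sit directly next to the loop instead of relying on a separate flag variable.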