Question

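I'm trying to develop a multithreaded function in Python 3.6, and sometimes my code freezes. From my tests, I think the problem comes from os.write() or os.read(), but I don't know why.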

Here is my code (I don't think partialTransform() causes the freeze, but I've included it so the code can be understood):

import os
import pickle

def naiveTransform(netData,**kwargs):

        # parallelizable part
        def partialTransform(debut, fin) :
            for i in range(debut, fin) :
                j = 0
                # compute all the distances:
                while j < nbrPoint :
                    distance[j] = euclidianDistance(netData[i], netData[j])
                    j += 1

                #construction of the graph :
                j = 0
                del distance[i]
                while j < k :
                    nearest = min(distance, key=distance.get)
                    del distance[nearest]   #if k > 1 we don't want to get always the same point.
                    graph.append([i, nearest])
                    j += 1

            return graph



        k = kwargs.get('k', 1)  # default value still to be decided.
        nbrCore = kwargs.get('Core', 1)
        nbrPoint = len(netData)
        nbrPointCore = nbrPoint//nbrCore
        distance = dict()
        graph = []

        #pipes
        r = [-1]*nbrCore
        w = [-1]*nbrCore
        pid = [-1]*nbrCore

        for i in range(nbrCore):
            r[i], w[i] = os.pipe()

            try:
                pid[i] = os.fork()
            except OSError:
                exit("Could not create a child process\n")


            if pid[i] == 0:
                if i < nbrCore-1 :
                    g = partialTransform(i*nbrPointCore, (i+1)*nbrPointCore)
                else :
                    g = partialTransform(i*nbrPointCore, nbrPoint)  # make sure no point is forgotten.
                print("write in " + str(i))
                import sys
                print(sys.getsizeof(g))
                os.write(w[i], pickle.dumps(g))
                print("exit")
                exit()


        for i in range(nbrCore):
            print("waiting " + str(i))
            finished = os.waitpid(pid[i], 0)
            print("received")
            graph += pickle.loads(os.read(r[i], 250000000))

        return graph

The code freezes when the parameter k is greater than or equal to 5. I checked the size of the result with:

print(sys.getsizeof(g))

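In my example, the size is 33928 when k = 4 and 43040 when k = 5, so I don't think that is the problem? The number of cores used doesn't seem to have any effect on the freeze.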

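I'm still a beginner in Python, so it may be something obvious, but I couldn't find any similar problem on the internet. Do you have any idea what could be causing these freezes?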

Answer 1

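Pipes have a limited-size buffer, and once it is full the child blocks in its write to the pipe until the parent reads from it. But the parent is waiting for the child to exit, so you hang. You can avoid the buffer limit by writing the object to a temporary file instead. By the time the parent reads it, the data will still be in the operating system's file cache, so it is still fast.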
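
For comparison, here is a minimal sketch of a different way around the same deadlock, not the approach this answer uses: keep the pipe, but have the parent drain it before calling waitpid. The names run_child_and_collect, payload and big are made up for the illustration. It assumes a POSIX system where os.fork is available, and a payload whose pickle is larger than the pipe buffer (commonly 64 KiB on Linux).

```python
import os
import pickle


def run_child_and_collect(payload):
    """Fork a child that pickles `payload` into a pipe.

    The parent reads the pipe to EOF *before* waiting for the child,
    so the child can never get stuck on a full pipe buffer.
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: close the unused read end, write everything, then leave.
        os.close(r)
        with os.fdopen(w, "wb") as out:
            pickle.dump(payload, out)
        os._exit(0)  # skip the parent's cleanup handlers in the child

    # Parent: close its copy of the write end so read() can see EOF.
    os.close(w)
    with os.fdopen(r, "rb") as src:
        data = src.read()  # loops internally until EOF
    os.waitpid(pid, 0)     # safe now: the child has nothing left to write
    return pickle.loads(data)


# Hypothetical payload, large enough that its pickle exceeds 64 KiB.
big = [[i, i + 1] for i in range(100_000)]
result = run_child_and_collect(big)
print(len(result))
```

Reading to EOF also avoids relying on a single os.read call, which may return fewer bytes than the child wrote. With several children, the parent would still read the pipes one after another, so a child whose pipe is not currently being read stays blocked until its turn.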

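There is one trick to all of this. The parent needs to convince its buffered file layer to re-check the file after the child has written to it; otherwise the read is simply satisfied from its zero-length internal cache. The file offset is also shared between parent and child after the fork, so it sits at the end of the data once the child has written. You can handle this with a seek.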

Do this:

import tempfile

def naiveTransform(netData,**kwargs):

        # *** code removed for example ***
        # files
        tmp = [tempfile.TemporaryFile() for _ in range(nbrCore)]
        pid = [-1]*nbrCore

        for i in range(nbrCore):
            try:
                pid[i] = os.fork()
            except OSError:
                exit("Could not create a child process\n")


            if pid[i] == 0:
                if i < nbrCore-1 :
                    g = partialTransform(i*nbrPointCore, (i+1)*nbrPointCore)
                else :
                    g = partialTransform(i*nbrPointCore, nbrPoint)  # make sure no point is forgotten.
                print("write in " + str(i))
                import sys
                print(sys.getsizeof(g))
                pickle.dump(g, tmp[i])
                tmp[i].close()
                print("exit")
                exit()

        for i in range(nbrCore):
            print("waiting " + str(i))
            finished = os.waitpid(pid[i], 0)
            print("received")
            # seek to get updated file content
            tmp[i].seek(0,2)
            tmp[i].seek(0)
            graph += pickle.load(tmp[i])

        return graph
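
The seek step can be checked in isolation. The following is a small sketch under the same assumptions (POSIX, os.fork available), with a tiny made-up payload and the illustrative names offset_after_child and graph_part. It shows that the file offset in the parent has moved after the child's write, which is why the parent rewinds before loading.

```python
import os
import pickle
import tempfile

# Opened before the fork, so parent and child share one open file
# (and therefore one file offset).
tmp = tempfile.TemporaryFile()

pid = os.fork()
if pid == 0:
    # Child: write a small made-up result and flush it by closing.
    pickle.dump([[0, 1], [1, 2]], tmp)
    tmp.close()
    os._exit(0)

os.waitpid(pid, 0)

# The kernel-level offset now points past the child's data.
offset_after_child = os.lseek(tmp.fileno(), 0, os.SEEK_CUR)

# Same two seeks as in the answer: go to the end, then back to the start.
tmp.seek(0, 2)
tmp.seek(0)
graph_part = pickle.load(tmp)
tmp.close()

print(offset_after_child, graph_part)
```

Because the data goes through a regular file rather than a pipe, the child never blocks on a full buffer, and waiting for it before reading is safe.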

Multithreaded communication freezes in Python 3.6
