PHP pthreads appears to have random memory management and memory leaks

Asked: 2015-10-27 19:49:44

Tags: php multithreading memory-leaks pthreads threadpool

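I have hit a dead end here. I have tried everything I know to isolate the memory leak, and from what I have gathered it seems to be related to multithreading this script with pthreads.

I am writing a bot for Wikipedia and I am nearly done. Functionally the program is sound, and it works correctly both multithreaded and single-threaded. The memory leak only occurs when multithreading is turned on.

Both versions use exactly the same function, in the same script/file, to make debugging easier.

The engine the threads live in: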

//Multithread engine

//This thread class allows for asynchronous function calls.  This is useful for the functions that consume time and can run in the background.
//Caution must be exercised to ensure that the functions are thread safe.
class AsyncFunctionCall extends Thread {

    protected $method;
    protected $params;
    public $result;

    public function __construct( $method, $params ) {
        $this->method = $method;
        $this->params = $params;
        $this->result = null; 
    }

    public function run() {
        if (($this->result=call_user_func_array($this->method, $this->params))) {
            return true;
        } else return false;
    }

    public static function call($method, $params){
        $thread = new AsyncFunctionCall($method, $params);
        if($thread->start()){
            return $thread;
        } else {
            echo "Unable to initiate background function $method!\n";
            return false;
        }
    }
}

// Analyze multiple pages simultaneously and edit them.
class ThreadedBot extends Collectable {

    protected $page, $pageid, $alreadyArchived, $ARCHIVE_ALIVE, $TAG_OVERRIDE, $ARCHIVE_BY_ACCESSDATE, $TOUCH_ARCHIVE, $DEAD_ONLY, $NOTIFY_ERROR_ON_TALK, $NOTIFY_ON_TALK, $TALK_MESSAGE_HEADER, $TALK_MESSAGE, $TALK_ERROR_MESSAGE_HEADER, $TALK_ERROR_MESSAGE, $DEADLINK_TAGS, $CITATION_TAGS, $IGNORE_TAGS, $ARCHIVE_TAGS, $VERIFY_DEAD, $LINK_SCAN;

    public $result;

    public function __construct($page, $pageid, $alreadyArchived, $ARCHIVE_ALIVE, $TAG_OVERRIDE, $ARCHIVE_BY_ACCESSDATE, $TOUCH_ARCHIVE, $DEAD_ONLY, $NOTIFY_ERROR_ON_TALK, $NOTIFY_ON_TALK, $TALK_MESSAGE_HEADER, $TALK_MESSAGE, $TALK_ERROR_MESSAGE_HEADER, $TALK_ERROR_MESSAGE, $DEADLINK_TAGS, $CITATION_TAGS, $IGNORE_TAGS, $ARCHIVE_TAGS, $VERIFY_DEAD, $LINK_SCAN) {
        $this->page = $page;
        $this->pageid = $pageid;
        $this->alreadyArchived = $alreadyArchived;
        $this->ARCHIVE_ALIVE = $ARCHIVE_ALIVE;
        $this->TAG_OVERRIDE = $TAG_OVERRIDE;
        $this->ARCHIVE_BY_ACCESSDATE = $ARCHIVE_BY_ACCESSDATE;
        $this->TOUCH_ARCHIVE = $TOUCH_ARCHIVE;
        $this->DEAD_ONLY = $DEAD_ONLY;
        $this->NOTIFY_ERROR_ON_TALK = $NOTIFY_ERROR_ON_TALK;
        $this->NOTIFY_ON_TALK = $NOTIFY_ON_TALK;
        $this->TALK_MESSAGE_HEADER = $TALK_MESSAGE_HEADER;
        $this->TALK_MESSAGE = $TALK_MESSAGE;
        $this->TALK_ERROR_MESSAGE_HEADER = $TALK_ERROR_MESSAGE_HEADER;
        $this->TALK_ERROR_MESSAGE = $TALK_ERROR_MESSAGE;
        $this->DEADLINK_TAGS = $DEADLINK_TAGS;
        $this->CITATION_TAGS = $CITATION_TAGS;
        $this->IGNORE_TAGS = $IGNORE_TAGS;
        $this->ARCHIVE_TAGS = $ARCHIVE_TAGS;
        $this->VERIFY_DEAD = $VERIFY_DEAD;
        $this->LINK_SCAN = $LINK_SCAN;    
    }

    public function run() {
        ini_set( 'memory_limit', '1G' );
        echo ini_get( 'memory_limit' )."; ".(memory_get_usage( true )/1024/1024)." MB\n";
        $this->result = analyzePage( $this->page, $this->pageid, $this->alreadyArchived, $this->ARCHIVE_ALIVE, $this->TAG_OVERRIDE, $this->ARCHIVE_BY_ACCESSDATE, $this->TOUCH_ARCHIVE, $this->DEAD_ONLY, $this->NOTIFY_ERROR_ON_TALK, $this->NOTIFY_ON_TALK, $this->TALK_MESSAGE_HEADER, $this->TALK_MESSAGE, $this->TALK_ERROR_MESSAGE_HEADER, $this->TALK_ERROR_MESSAGE, $this->DEADLINK_TAGS, $this->CITATION_TAGS, $this->IGNORE_TAGS, $this->ARCHIVE_TAGS, $this->VERIFY_DEAD, $this->LINK_SCAN);
        $this->setGarbage();
        $this->page = null;
        $this->pageid = null;
        $this->alreadyArchived = null;
        $this->ARCHIVE_ALIVE = null;
        $this->TAG_OVERRIDE = null;
        $this->ARCHIVE_BY_ACCESSDATE = null;
        $this->TOUCH_ARCHIVE = null;
        $this->DEAD_ONLY = null;
        $this->NOTIFY_ERROR_ON_TALK = null;
        $this->NOTIFY_ON_TALK = null;
        $this->TALK_MESSAGE_HEADER = null;
        $this->TALK_MESSAGE = null;
        $this->TALK_ERROR_MESSAGE_HEADER = null;
        $this->TALK_ERROR_MESSAGE = null;
        $this->DEADLINK_TAGS = null;
        $this->CITATION_TAGS = null;
        $this->IGNORE_TAGS = null;
        $this->ARCHIVE_TAGS = null;
        $this->VERIFY_DEAD = null;
        $this->LINK_SCAN = null;
        unset( $this->page, $this->pageid, $this->alreadyArchived, $this->ARCHIVE_ALIVE, $this->TAG_OVERRIDE, $this->ARCHIVE_BY_ACCESSDATE, $this->TOUCH_ARCHIVE, $this->DEAD_ONLY, $this->NOTIFY_ERROR_ON_TALK, $this->NOTIFY_ON_TALK, $this->TALK_MESSAGE_HEADER, $this->TALK_MESSAGE, $this->TALK_ERROR_MESSAGE_HEADER, $this->TALK_ERROR_MESSAGE, $this->DEADLINK_TAGS, $this->CITATION_TAGS, $this->IGNORE_TAGS, $this->ARCHIVE_TAGS, $this->VERIFY_DEAD, $this->LINK_SCAN );
    }
}

This block in the body of the program calls the threading engine:

if( WORKERS === false ) {
    foreach( $pages as $tid => $tpage ) {
        $pagesAnalyzed++;
        $stats = analyzePage( $tpage['title'], $tpage['pageid'], $alreadyArchived, $ARCHIVE_ALIVE, $TAG_OVERRIDE, $ARCHIVE_BY_ACCESSDATE, $TOUCH_ARCHIVE, $DEAD_ONLY, $NOTIFY_ERROR_ON_TALK, $NOTIFY_ON_TALK, $TALK_MESSAGE_HEADER, $TALK_MESSAGE, $TALK_ERROR_MESSAGE_HEADER, $TALK_ERROR_MESSAGE, $DEADLINK_TAGS, $CITATION_TAGS, $IGNORE_TAGS, $ARCHIVE_TAGS, $VERIFY_DEAD, $LINK_SCAN );
        if( $stats['pagemodified'] === true ) $pagesModified++;
        $linksAnalyzed += $stats['linksanalyzed'];
        $linksArchived += $stats['linksarchived'];
        $linksFixed += $stats['linksrescued'];
        $linksTagged += $stats['linkstagged'];
        $alreadyArchived = array_merge( $stats['newlyArchived'], $alreadyArchived );
        $failedToArchive = array_merge( $failedToArchive, $stats['archiveProblems'] );
        $allerrors = array_merge( $allerrors, $stats['errors'] );
        file_put_contents( $dlaaLocation, serialize( $alreadyArchived ) );
    }
} else {
    //for( $i = 0; $i < count( $pages ); $i += $workerLimit ) {
        $workerQueue = new Pool( $workerLimit );
        //$tpages = array_slice( $pages, $i, $workerLimit );
        foreach( $pages as $tid => $tpage ) {
            $pagesAnalyzed++;
            echo "Submitted {$tpage['title']}, job ".($tid+1)." for analyzing...\n";
            $workerQueue->submit( new ThreadedBot( $tpage['title'], $tpage['pageid'], $alreadyArchived, $ARCHIVE_ALIVE, $TAG_OVERRIDE, $ARCHIVE_BY_ACCESSDATE, $TOUCH_ARCHIVE, $DEAD_ONLY, $NOTIFY_ERROR_ON_TALK, $NOTIFY_ON_TALK, $TALK_MESSAGE_HEADER, $TALK_MESSAGE, $TALK_ERROR_MESSAGE_HEADER, $TALK_ERROR_MESSAGE, $DEADLINK_TAGS, $CITATION_TAGS, $IGNORE_TAGS, $ARCHIVE_TAGS, $VERIFY_DEAD, $LINK_SCAN ) );

        }
        $workerQueue->shutdown();
        $workerQueue->collect(
        function( $thread ) {
            global $pagesModified, $linksAnalyzed, $linksArchived, $linksFixed, $linksTagged, $alreadyArchived, $failedToArchive, $allerrors;
            $stats = $thread->result;
            if( $stats['pagemodified'] === true ) $pagesModified++;
            $linksAnalyzed += $stats['linksanalyzed'];
            $linksArchived += $stats['linksarchived'];
            $linksFixed += $stats['linksrescued'];
            $linksTagged += $stats['linkstagged'];
            $alreadyArchived = array_merge( $stats['newlyArchived'], $alreadyArchived );
            $failedToArchive = array_merge( $failedToArchive, $stats['archiveProblems'] );
            $allerrors = array_merge( $allerrors, $stats['errors'] );
            return $thread->isGarbage();
        });
        echo "!!!!!!!!!!!!!!Links analyzed so far: $linksAnalyzed\n\n";
        file_put_contents( $dlaaLocation, serialize( $alreadyArchived ) );
        //$workerQueue = null;
        //unset( $workerQueue );
    //}
}

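As seen above, the if statement decides whether to run multithreaded or single-threaded. A few notes:

$workerLimit = 20. All resources initialized in the functions are closed, nulled and unset. The function being called does not appear to leak by itself, since the single-threaded run has no problems. memory_limit has been confirmed to be 1G. The workers eventually crash with an OOM fatal error. The memory allocation appears to be randomly distributed among the workers, and each worker gradually uses more and more memory. According to Task Manager, the script itself reaches 700 MB before crashing. Lastly, the more workers I add, the faster each worker crashes, and 100 workers cause a crash immediately.

Here is part of the output: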
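One way to narrow down whether a particular function accumulates memory is to call it repeatedly and compare memory usage before and after. The following is only a minimal sketch: `measureGrowth` is a hypothetical helper that is not part of the bot. Note that `memory_get_usage()` reports memory handed out by PHP's own memory manager, so allocations made inside native libraries may not show up in the number at all. That caveat may be relevant here, since the reported usage stays small while the workers run out of memory.

```php
<?php
// Hypothetical helper (not part of the original script): call $fn repeatedly
// and return the net change in PHP-managed memory, in bytes.
function measureGrowth(callable $fn, $iterations = 100)
{
    $fn();               // warm-up call so one-time allocations are not counted
    gc_collect_cycles(); // collect reference cycles before taking the baseline
    $before = memory_get_usage();

    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }

    gc_collect_cycles();
    return memory_get_usage() - $before;
}
```

A function that keeps references alive between calls should show growth that scales with the iteration count, while a clean one should stay close to zero. A small result does not rule out a leak outside PHP's memory manager.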

Analyzed Stanley Hartt (8742961)
Rescued: 0; Tagged dead: 0; Archived: 0; Max System Memory Used: 1.25 MB

PHP Fatal error:  Out of memory (allocated 46661632) (tried to allocate 6557907 bytes) in C:\Users\Maximilian Doerr\Documents\GitHub\Cyberbot_II\deadlink.php on line 1259

Fatal error: Out of memory (allocated 46661632) (tried to allocate 6557907 bytes) in C:\Users\Maximilian Doerr\Documents\GitHub\Cyberbot_II\deadlink.php on line 1259
Analyzed High-explosive anti-tank warhead (255968)
Rescued: 0; Tagged dead: 0; Archived: 5; Max System Memory Used: 22.75 MB

PHP Fatal error:  Out of memory (allocated 14680064) (tried to allocate 6341940 bytes) in C:\Users\Maximilian Doerr\Documents\GitHub\Cyberbot_II\deadlink.php on line 1261

Fatal error: Out of memory (allocated 14680064) (tried to allocate 6341940 bytes) in C:\Users\Maximilian Doerr\Documents\GitHub\Cyberbot_II\deadlink.php on line 1261
PHP Fatal error:  Out of memory (allocated 6291456) (tried to allocate 5243257 bytes) in C:\Users\Maximilian Doerr\Documents\GitHub\Cyberbot_II\deadlink.php on line 1259

Fatal error: Out of memory (allocated 6291456) (tried to allocate 5243257 bytes) in C:\Users\Maximilian Doerr\Documents\GitHub\Cyberbot_II\deadlink.php on line 1259
PHP Fatal error:  Out of memory (allocated 7864320) (tried to allocate 5245685 bytes) in C:\Users\Maximilian Doerr\Documents\GitHub\Cyberbot_II\deadlink.php on line 1259

Fatal error: Out of memory (allocated 7864320) (tried to allocate 5245685 bytes) in C:\Users\Maximilian Doerr\Documents\GitHub\Cyberbot_II\deadlink.php on line 1259
Analyzed Nadezhda Tylik (2896780)
Rescued: 0; Tagged dead: 0; Archived: 5; Max System Memory Used: 2.75 MB

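This is my first time multithreading, so I am new to this. I would appreciate any help and advice, and if you have more questions, please ask. :-)

1 answer:

Answer 0 (score: 0)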

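It turns out this was not coming from pthreads. Multithreading simply made the problem far more noticeable. I was using multiurl, and because I was using the wrong function to close the handles, the memory was not being freed even though the handles were being closed.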
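The answer does not say which function was the wrong one, so the following is only a sketch of the usual cleanup order, assuming "multiurl" refers to PHP's `curl_multi_*` functions. `fetchAll` is a hypothetical helper, not code from the bot. Each easy handle is detached from the multi handle with `curl_multi_remove_handle()` before it is closed, and the multi handle itself is closed last.

```php
<?php
// Hypothetical helper (not from the original script): fetch several URLs in
// parallel and release every cURL handle afterwards.
function fetchAll(array $urls)
{
    $multi = curl_multi_init();
    $handles = array();

    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            curl_multi_select($multi, 1.0); // wait for activity, up to 1 second
        }
    } while ($running > 0 && $status === CURLM_OK);

    $results = array();
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        // Detach the easy handle from the multi handle first, then close it.
        curl_multi_remove_handle($multi, $ch);
        curl_close($ch); // has no effect on PHP 8+, where handles are objects
    }

    curl_multi_close($multi);
    return $results;
}
```

Calling `fetchAll(array('a' => 'https://example.org/'))` would return an array keyed the same way as the input. The URL here is only a placeholder.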