Question

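I have a PHP script that queries an API and downloads some JSON information, then inserts that information into a MySQL database; we'll call this script A.php. I need to run this script multiple times a minute, preferably as many times in a minute as I can, without allowing two instances to run at the same exact time or with any overlap. My solution was to create scriptB.php and put it on a one-minute cron job. Here is the source code of scriptB.php: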

function next_run()
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, "http://somewebsite.com/scriptA.php");
    curl_exec($curl);
    curl_close($curl);
    unset($curl);
}
$i = 0;
$times_to_run = 7;
$function = array();
while ($i++ < $times_to_run) {
    $function = next_run();
    sleep(3);
}

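My question at this point is how cURL performs when used in a loop: does this code trigger scriptA.php and THEN, once it has finished loading, start the next cURL request? Does the 3 second sleep even make a difference, or will this literally run as fast as the time it takes each cURL request to complete? My objective is to time this script and run it as many times as possible in a one minute window without two iterations of it being run at the same time. I don't want to include the sleep statement if it is not needed. I believe what happens is that cURL will run each request upon finishing the last; if I am wrong, is there some way I can instruct it to do this?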

Answer 1

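I need to run this script multiple times a minute, preferably as many times in a minute that I can without allowing two instances to run

You're in luck, because I wrote a class to handle exactly this kind of thing. You can find it on my GitHub: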

https://github.com/ArtisticPhoenix/MISC/blob/master/ProcLock.php

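I will also copy the full code at the end of this post.

The basic idea is to create a file, which I will call afile.lock for this example. This file records the PID, the process ID of the current process that cron runs. Then, when cron tries to run the process again, it checks this lock file and sees whether a PHP process with this PID is still running.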

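If there is, it updates the modified time of the lock file (and throws an exception).
If there is not, you are free to create a new instance of the "worker".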
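The idea above can be sketched in a few lines. This is a simplified illustration, not the ProcLock class itself: `try_pid_lock()` is a hypothetical helper, and the `$isAlive` callback is an assumption standing in for the real "is a PHP process with this PID running?" check. Note that the read-then-write sequence is not atomic, so two processes starting at the same instant could both win.

```php
<?php
// Hypothetical helper (not part of ProcLock): returns true if this process
// took the lock, false if a live process already holds it.
function try_pid_lock(string $lockFile, callable $isAlive): bool
{
    // Read the PID recorded by a previous run, if any.
    $pid = is_file($lockFile)
        ? (int) trim((string) file_get_contents($lockFile))
        : 0;

    if ($pid > 0 && $isAlive($pid)) {
        // A worker is still running: refresh the mtime as a heartbeat
        // and refuse to start a second instance.
        touch($lockFile);
        return false;
    }

    // No lock file, or the recorded PID is stale: record our own PID.
    return file_put_contents($lockFile, (string) getmypid()) !== false;
}
```

A cron entry point would call this once and exit immediately when it returns false.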

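As a bonus, the script whose PID we are tracking can use the lock file's modified time as a signal to shut down when the file stops being updated. For example, if cron is stopped or the lock file is manually deleted, you can set things up so that the running script detects this and self-destructs.

So not only can you keep multiple instances from running, you can also tell the current instance to die if cron is turned off.

Basic usage is as follows. In the cron file that starts the "worker":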

//define a lock file (this is actually optional)
ProcLock::setLockFile(__DIR__.'/afile.lock');

try{
 //if you didn't set a lock file you can pass it in with this method call
  ProcLock::lock();
  //execute your process

}catch(\Exception $e){
    if($e->getCode() == ProcLock::ALREADY_LOCKED){
      //just exit or what have you
    }else{
      //some other exception happened.
    }
}

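It is basically that easy.

Then, inside the running process, you can check every so often (for example, if you have a loop that does something):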

 $expires = 90; //1 1/2 minute (you may need a bit of fudge time)
 foreach($something as $a=>$b){
     $lastAccess = ProcLock::getLastAccess();
     if(false == $lastAccess  || $lastAccess + $expires < time()){
         //if last access is false (no lock file)
         //or last access + expiration, is less than the current time

         //log something like killed by lock timeout

         exit(); 
     }

 }

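Basically, this says that either the lock file was deleted while the process was running, or cron failed to update it before the expiration time. So here we give it 90 seconds, while cron should update the lock file every 60 seconds. As I said, the lock file is updated automatically if it is found when lock() is called: lock() calls canLock(), and if that returns true we can lock the process because it is not currently locked; if it returns false, it has already run touch($lockfile), which updates the mtime (modified time).

Obviously, the process can only kill itself this way if it actively checks the access and expiration times.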
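The expiry check in the loop above can be pulled into a small function. This is a minimal sketch using plain file functions rather than ProcLock::getLastAccess(); `lock_expired()` and its optional `$now` parameter are hypothetical names introduced here for illustration.

```php
<?php
// Hypothetical helper: true when the worker should shut itself down,
// i.e. the lock file is gone or its heartbeat (mtime) is too old.
function lock_expired(string $lockFile, int $expires, ?int $now = null): bool
{
    $now = $now ?? time();

    // filemtime() results are cached per request, so drop the cached entry
    // for this file before asking again.
    clearstatcache(true, $lockFile);

    $mtime = is_file($lockFile) ? filemtime($lockFile) : false;

    // No lock file, or last heartbeat + allowance is already in the past.
    return $mtime === false || $mtime + $expires < $now;
}
```

Inside a long-running loop the worker would call `lock_expired($lockFile, 90)` once per iteration and exit when it returns true.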

This script works on both Windows and Linux. On Windows, under certain circumstances, the lock file is not properly deleted (sometimes when pressing ctrl+c in the CMD window). However, I have taken great pains to make sure this does not happen, so the class file contains a custom register_shutdown_function that runs when the PHP script ends.

Note that when running something with ProcLock in a browser, the process ID will always be the same no matter which tab it runs in. So if you open one tab that is process-locked and then open another tab, the process locker sees it as the same process and allows it to lock again. To run it properly in a browser and test the locking, you must use two separate browsers, such as Chrome and Firefox. It is not really intended to be run in a browser, but this is one quirk I noticed.

One last note: this class is completely static, because a running process can only have one process ID, which should be obvious.

The tricky parts are:

Making sure the lock file is disposed of when a critical PHP failure happens.
Making sure another process did not pick up the PID number after PHP released it. This can be done with relative accuracy, because we can tell whether a PHP process is using it, and if so we assume it is the process we need. The chance that the PID is reused by another process that quickly is small, and the chance that it is another PHP process is smaller still.
Making all of this work on both Linux and Windows.

Fortunately, I have already put in enough time to do all of these things. This is a more generic version of the original lock script I made for my job, which we have used successfully for 3 years to keep control over various synchronous cron jobs, everything from sFTP upload scanning and expired-file cleanup to RabbitMq message workers that run for an indefinite period of time.

In any case, here is the full code. Enjoy.

<?php
/*
 (c) 2017 ArtisticPhoenix

 For license information please view the LICENSE
 file included with this source code GPL3.0.

 Process Locker
 ==================================================================
 This is a pseudo implementation of mutex since php does not have
 any thread synchronization objects

 This class uses files to provide locking functionality.
 Lock will be released in following cases
 1 - user calls unlock
 2 - when this lock object gets deleted
 3 - when request or script ends
 4 - when pid of lock does not match self::$_pid
 ==================================================================

 Only one Lock per Process!

 -note- when running in a browser typically all tabs will have the same PID
 so the locking will not be able to tell if it's the same process, to get
 around this run in CLI, or use 2 different browsers, so the PID numbers are different.

 This class is static for the simple fact that locking is done per-process, so
 there is no need to ever have duplicate ProcLocks within the same process
 ---------------------------------------------------------------
 */
final class ProcLock
{
    /**
     * exception code numbers
     * @var int
     */
    const DIRECTORY_NOT_FOUND = 2000;
    const LOCK_FIRST          = 2001;
    const FAILED_TO_UNLOCK    = 2002;
    const FAILED_TO_LOCK      = 2003;
    const ALREADY_LOCKED      = 2004;
    const UNKNOWN_PID         = 2005;
    const PROC_UNKNOWN_PID    = 2006;

    /**
     * path of the global lock file
     * @var string
     */
    protected static $_lockFile;

    /**
     * pid of this process, set once it holds the lock
     * @var int
     */
    protected static $_pid;

    /**
     * No construction allowed
     */
    private function __construct(){}

    /**
     * No clones allowed
     */
    private function __clone(){}

    /**
     * globally sets the lock file
     * @param string $lockFile
     */
    public static function setLockFile( $lockFile ){
        $dir = dirname( $lockFile );
        if( !is_dir( $dir )){
            throw new Exception("Directory {$dir} not found", self::DIRECTORY_NOT_FOUND); //pid directory invalid
        }
        self::$_lockFile = $lockFile;
    }

    /**
     * return global lockfile
     */
    public static function getLockFile() {
        return ( self::$_lockFile ) ? self::$_lockFile : false;
    }

    /**
     * safe check for local or global lock file
     */
    protected static function _chk_lock_file( $lockFile = null ){
        if( !$lockFile && !self::$_lockFile ){
            throw new Exception("Lock first", self::LOCK_FIRST);
        }elseif( $lockFile ){
            return $lockFile;
        }else{
            return self::$_lockFile;
        }
    }

    /**
     * release the lock held by this process
     * @param string $lockFile
     */
    public static function unlock( $lockFile = null ){
        if( !self::$_pid ){
            //no pid stored - not locked for this process
            return;
        }
        $lockFile = self::_chk_lock_file($lockFile);
        if(!file_exists($lockFile) || unlink($lockFile)){
            return true;
        }else{
            throw new Exception("Failed to unlock {$lockFile}", self::FAILED_TO_UNLOCK ); //no lock file exists to unlock or no permissions to delete file
        }
    }

    /**
     * take the lock for this process, or throw ALREADY_LOCKED
     * @param string $lockFile
     */
    public static function lock( $lockFile = null ){
        $lockFile = self::_chk_lock_file($lockFile);
        if( self::canLock( $lockFile )){
            self::$_pid = getmypid();
            if(!file_put_contents($lockFile, self::$_pid ) ){
                throw new Exception("Failed to lock {$lockFile}", self::FAILED_TO_LOCK ); //no permission to create pid file
            }
        }else{
            throw new Exception('Process is already running[ '.$lockFile.' ]', self::ALREADY_LOCKED ); //there is a process running with this pid
        }
    }

    /**
     * read the pid stored in the lock file, false if there is no lock file
     * @param string $lockFile
     */
    public static function getPidFromLockFile( $lockFile = null ){
        $lockFile = self::_chk_lock_file($lockFile);
        if(!file_exists($lockFile) || !is_file($lockFile)){
            return false;
        }
        $pid = file_get_contents($lockFile);
        return intval(trim($pid));
    }

    /**
     * pid stored by this process, false if it does not hold a lock
     * @return number
     */
    public static function getMyPid(){
        return ( self::$_pid ) ? self::$_pid : false;
    }

    /**
     * check that the lock file holds the given (or stored) pid
     * @param string $lockFile
     * @param string $myPid
     * @throws Exception
     */
    public static function validatePid($lockFile = null, $myPid = false ){
        $lockFile = self::_chk_lock_file($lockFile);
        if( !self::$_pid && !$myPid ){
            throw new Exception('no pid supplied', self::UNKNOWN_PID ); //no stored or injected pid number
        }elseif( !$myPid ){
            $myPid = self::$_pid;
        }
        return ( $myPid == self::getPidFromLockFile( $lockFile ));
    }

    /**
     * check whether this process may take the lock;
     * updates the mtime of the lock file when a live process already holds it
     * @param string $lockFile
     */
    public static function canLock( $lockFile = null){
        if( self::$_pid ){
            throw new Exception("Process was already locked", self::ALREADY_LOCKED ); //process was already locked - call this only before locking
        }
        $lockFile = self::_chk_lock_file($lockFile);
        $pid = self::getPidFromLockFile( $lockFile );
        if( !$pid ){
            //if there is not a pid then there is no lock file and it's ok to lock it
            return true;
        }
        //validate the pid in the existing file
        $valid = self::_validateProcess($pid);
        if( !$valid ){
            //if it's not valid - delete the lock file
            if(unlink($lockFile)){
                return true;
            }else{
                throw new Exception("Failed to unlock {$lockFile}", self::FAILED_TO_UNLOCK ); //no lock file exists to unlock or no permissions to delete file
            }
        }
        //if there was a valid process running return false, we cannot lock it.
        //update the lock file's mtime - this is useful for a heartbeat, a periodic keepalive script.
        touch($lockFile);
        return false;
    }

    /**
     * last modified time of the lock file, false if there is no lock file
     * @param string $lockFile
     */
    public static function getLastAccess( $lockFile = null ){
        $lockFile = self::_chk_lock_file($lockFile);
        //the filename is the second argument of clearstatcache()
        clearstatcache( true, $lockFile );
        if( file_exists( $lockFile )){
            return filemtime( $lockFile );
        }
        return false;
    }

    /**
     * check whether a php (or httpd) process with this pid is running
     * @param int $pid
     */
    protected static function _validateProcess( $pid ){
        $task = false;
        $pid = intval($pid);
        //match "Windows ..." at the start only, so "Darwin" is not mistaken for Windows
        if(stripos(php_uname('s'), 'win') === 0){
            $task = shell_exec("tasklist /fi \"PID eq {$pid}\"");
            /*
             'INFO: No tasks are running which match the specified criteria.
             '
             */
            /*
             '
             Image Name                     PID Session Name        Session#    Mem Usage
             ========================= ======== ================ =========== ============
             php.exe                       5064 Console                    1     64,516 K
             '
             */
        }else{
            $cmd = "ps ".intval($pid);
            $task = shell_exec($cmd);
            /*
             '  PID TTY      STAT   TIME COMMAND
             '
             */
        }
        if($task){
            return ( preg_match('/php|httpd/', $task) ) ? true : false;
        }
        throw new Exception("pid detection failed {$pid}", self::PROC_UNKNOWN_PID); //failed to parse the pid look up results
        //this has been tested on CentOs 5,6,7 and windows 7 and 10
    }

    /**
     * destroy a lock ( safe unlock )
     */
    public static function destroy($lockFile = null){
        try{
            $lockFile = self::_chk_lock_file($lockFile);
            self::unlock( $lockFile );
        }catch( Exception $e ){
            //ignore errors here - this is called from destruction so we don't care if it fails or succeeds
            //generally a new process will be able to tell if the pid is still in use so
            //this is just a cleanup process
        }
    }
}

/*
 * register our shutdown handler - if the script dies unlock the lock
 * this is superior to __destruct(), because the shutdown handler runs even in situations where PHP exhausts all memory
 *
 * (the callback was registered as '\\Lib\\Queue\\ProcLock' in the listing; since the class
 * as shown here has no namespace, it is referenced by its plain name)
 */
register_shutdown_function(array('ProcLock', "destroy"));

Answer 2

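preferably as many times in a minute that I can without allowing two instances to run at the same exact time or with any overlap. - Then you should not use a cronjob at all; you should use a daemon. But if you have to use a cronjob for some reason (for example, if you are on a shared web hosting platform that does not allow daemons), I guess you could use the sleep hack to run the same code several times a minute: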

* * * * * /usr/bin/php /path/to/scriptA.php
* * * * * sleep 10; /usr/bin/php /path/to/scriptA.php
* * * * * sleep 20; /usr/bin/php /path/to/scriptA.php
* * * * * sleep 30; /usr/bin/php /path/to/scriptA.php
* * * * * sleep 40; /usr/bin/php /path/to/scriptA.php
* * * * * sleep 50; /usr/bin/php /path/to/scriptA.php

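That should make it execute every 10 seconds.

To make sure it does not run in parallel when the previous execution has not finished yet, add this to the start of scriptA: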

call_user_func ( function () {
    static $lock;
    $lock = fopen ( __FILE__, "rb" );
    if (! flock ( $lock, LOCK_EX | LOCK_NB )) {
        // failed to get a lock, probably means another instance is already running
        die ();
    }
    register_shutdown_function ( function () use (&$lock) {
        flock ( $lock, LOCK_UN );
    } );
} );

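If another instance of scriptA is already running, the new one will just die(). If you would rather have it wait for the previous execution to finish instead of exiting, remove LOCK_NB. That can be dangerous, though: if every execution, or even just the majority of executions, takes more than 10 seconds, you will have more and more processes waiting for the previous execution to finish, until you run out of RAM.

As for your curl questions: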
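The difference between the two modes can be sketched as follows. This is a variation on the snippet above, not the same code: the hypothetical `acquire_run_lock()` helper locks a separate lock file (the path is an assumption) instead of `__FILE__`, and returns the handle so the caller decides what to do when the lock is busy.

```php
<?php
// Hypothetical helper: returns an open, exclusively locked file handle,
// or null if the lock could not be taken.
// $wait = false -> LOCK_EX | LOCK_NB: fail immediately when another run holds the lock.
// $wait = true  -> LOCK_EX only: block until the other run releases it.
function acquire_run_lock(string $lockFile, bool $wait = false)
{
    // 'c' creates the file if needed and never truncates it.
    $handle = fopen($lockFile, 'c');
    if ($handle === false) {
        return null;
    }

    $mode = $wait ? LOCK_EX : (LOCK_EX | LOCK_NB);
    if (!flock($handle, $mode)) {
        fclose($handle);
        return null;
    }

    // Keep the handle open for the lifetime of the run; the lock is
    // released when the handle is closed or the process ends.
    return $handle;
}
```

At the top of scriptA this would look like `$lock = acquire_run_lock(__DIR__ . '/scriptA.lock') ?? exit;`, where the lock file name is again only an example.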

My question at this point is to how cURL performs when used in a loop, does this code trigger scriptA.php and THEN once it has finished loading it at that point start the next cURL request - That is correct: curl waits until the page has completely loaded, which usually means the entire scriptA has finished. (If you really want to, you can tell scriptA to finish the page load early with the fastcgi_finish_request() function, but that is unusual.)

Does the 3 second sleep even make a difference or will this literally run as fast as the time it takes each cURL request to complete - Yes, the sleep makes each iteration of the loop 3 seconds slower.

My objective is to time this script and run it as many times as possible in a one minute window without two iterations of it being run at the same time - Then make it a daemon that never exits, rather than a cronjob.

I don't want to include the sleep statement if it is not needed. - It is not needed.

I believe what happens is cURL will run each request upon finishing the last - That is correct.
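Because each call blocks until it has finished, running the work back to back with no sleep is just a loop with a deadline. The sketch below assumes a cron-driven scriptB rather than a daemon; `run_back_to_back()` and the 55 second budget are hypothetical choices, and `$task` stands in for the blocking curl_exec() call from the question.

```php
<?php
// Hypothetical helper: run $task repeatedly, one call after another,
// until the time budget is used up. Returns the number of completed runs.
function run_back_to_back(callable $task, float $budgetSeconds): int
{
    $deadline = microtime(true) + $budgetSeconds;
    $runs = 0;

    while (microtime(true) < $deadline) {
        // The call returns only when the task has finished, so the next
        // iteration can never overlap with this one.
        $task();
        $runs++;
    }

    return $runs;
}

// Example use (not executed here): keep the budget below 60 seconds,
// because a run that starts just before the deadline still has to finish.
// run_back_to_back(function () { /* curl_exec(...) as in next_run() */ }, 55.0);
```

The last run can still overshoot the budget, so a lock such as the ones shown earlier in this thread is what actually guarantees that the next cron invocation does not overlap.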

PHP cURL timing question
