Clarification on Kafka replica lag

Date: 2018-02-01 19:52:29

Tags: apache-kafka

The Kafka documentation states the following:

  Replicas that are still fetching messages from the leader but have not caught up to the latest messages within replica.lag.time.max.ms are considered out of sync.

I am not sure exactly what this means:

  1. Does the replica need to be 0 messages behind at least once every replica.lag.time.max.ms to be considered in sync,
  2. or must the latest message fetched by the replica be no older than replica.lag.time.max.ms?
  3. These two definitions are not the same thing: if #2 is meant, a replica could always be 2 or 3 messages behind, yet remain in sync as long as its drift never exceeds replica.lag.time.max.ms.

    But if #1 is meant, the replica would need to consume messages faster than they arrive.

2 answers:

Answer 0 (score: 1):

It's number 2. A replica is in sync if it has no data, not yet replicated from the leader, that is older than the lag time. If you feel the wording should be updated, please open a JIRA, as that is easy to change :)
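This definition can be illustrated with a minimal Python sketch (hypothetical names and values, not Kafka's actual code): sync status is judged by the time since the replica last fully caught up, not by how many messages it is currently behind.

```python
REPLICA_LAG_TIME_MAX_MS = 10_000  # assumed value of replica.lag.time.max.ms

def is_in_sync(last_caught_up_time_ms, now_ms):
    """A replica stays in sync as long as it has fully caught up to the
    leader at some point within the last replica.lag.time.max.ms,
    regardless of how many messages it is behind right now."""
    return (now_ms - last_caught_up_time_ms) <= REPLICA_LAG_TIME_MAX_MS

now = 1_000_000
# Fully caught up 3 seconds ago -> in sync, even if a few messages behind now.
print(is_in_sync(now - 3_000, now))   # True
# Last fully caught up 15 seconds ago -> out of sync.
print(is_in_sync(now - 15_000, now))  # False
```

Note that a replica trailing by a couple of messages is still in sync under this rule, matching case #2 in the question.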

Answer 1 (score: 0):

I think it is closer to #1, but not exactly. I'll paste some source code to help you. The source version is 1.0.2.

A replica is marked out of sync by Partition.getOutOfSyncReplicas(leaderReplica: Replica, maxLagMs: Long):

def getOutOfSyncReplicas(leaderReplica: Replica, maxLagMs: Long): Set[Replica] = {
  /**
   * there are two cases that will be handled here -
   * 1. Stuck followers: If the leo of the replica hasn't been updated for maxLagMs ms,
   *                     the follower is stuck and should be removed from the ISR
   * 2. Slow followers: If the replica has not read up to the leo within the last maxLagMs ms,
   *                    then the follower is lagging and should be removed from the ISR
   * Both these cases are handled by checking the lastCaughtUpTimeMs which represents
   * the last time when the replica was fully caught up. If either of the above conditions
   * is violated, that replica is considered to be out of sync
   **/
  val candidateReplicas = inSyncReplicas - leaderReplica

  val laggingReplicas = candidateReplicas.filter(r => (time.milliseconds - r.lastCaughtUpTimeMs) > maxLagMs)
  if (laggingReplicas.nonEmpty)
    debug("Lagging replicas are %s".format(laggingReplicas.map(_.brokerId).mkString(",")))

  laggingReplicas
}
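The filter above can be sketched in Python (simplified, with hypothetical field names): every non-leader replica whose last full catch-up is older than maxLagMs is dropped from the ISR.

```python
from collections import namedtuple

# Hypothetical stand-in for Kafka's Replica, keeping only the fields the filter reads.
Replica = namedtuple("Replica", ["broker_id", "last_caught_up_time_ms"])

def get_out_of_sync_replicas(in_sync_replicas, leader_replica, max_lag_ms, now_ms):
    """Mirror of the Scala filter: a follower is out of sync once it has
    not been fully caught up for more than max_lag_ms milliseconds."""
    candidates = in_sync_replicas - {leader_replica}
    return {r for r in candidates
            if now_ms - r.last_caught_up_time_ms > max_lag_ms}

leader = Replica(broker_id=1, last_caught_up_time_ms=100_000)
fresh = Replica(broker_id=2, last_caught_up_time_ms=95_000)   # caught up 5 s ago
stale = Replica(broker_id=3, last_caught_up_time_ms=80_000)   # caught up 20 s ago
isr = {leader, fresh, stale}
print(get_out_of_sync_replicas(isr, leader, max_lag_ms=10_000, now_ms=100_000))
# only broker 3 (stale) is returned
```

The leader is always excluded from the candidates, so only followers can be flagged as lagging.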

Replica.lastCaughtUpTimeMs is updated by Replica.updateLogReadResult(logReadResult: LogReadResult):

/**
 * If the FetchRequest reads up to the log end offset of the leader when the current fetch request is received,
 * set `lastCaughtUpTimeMs` to the time when the current fetch request was received.
 *
 * Else if the FetchRequest reads up to the log end offset of the leader when the previous fetch request was received,
 * set `lastCaughtUpTimeMs` to the time when the previous fetch request was received.
 *
 * This is needed to enforce the semantics of ISR, i.e. a replica is in ISR if and only if it lags behind leader's LEO
 * by at most `replicaLagTimeMaxMs`. These semantics allow a follower to be added to the ISR even if the offset of its
 * fetch request is always smaller than the leader's LEO, which can happen if small produce requests are received at
 * high frequency.
 **/
def updateLogReadResult(logReadResult: LogReadResult) {
  if (logReadResult.info.fetchOffsetMetadata.messageOffset >= logReadResult.leaderLogEndOffset)
    _lastCaughtUpTimeMs = math.max(_lastCaughtUpTimeMs, logReadResult.fetchTimeMs)
  else if (logReadResult.info.fetchOffsetMetadata.messageOffset >= lastFetchLeaderLogEndOffset)
    _lastCaughtUpTimeMs = math.max(_lastCaughtUpTimeMs, lastFetchTimeMs)

  logStartOffset = logReadResult.followerLogStartOffset
  logEndOffset = logReadResult.info.fetchOffsetMetadata
  lastFetchLeaderLogEndOffset = logReadResult.leaderLogEndOffset
  lastFetchTimeMs = logReadResult.fetchTimeMs
}
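The two snippets together explain why a follower that is always slightly behind can still stay in the ISR. A minimal Python simulation of this update rule (hypothetical names, not Kafka's actual code) shows a follower that trails the leader's LEO by one message on every fetch yet never gets evicted, because each fetch reaches the LEO as of the previous fetch and so keeps advancing lastCaughtUpTimeMs:

```python
REPLICA_LAG_TIME_MAX_MS = 10_000  # assumed value of replica.lag.time.max.ms

class Follower:
    """Keeps only the fields that updateLogReadResult touches."""
    def __init__(self):
        self.last_caught_up_time_ms = 0
        self.last_fetch_leader_leo = 0
        self.last_fetch_time_ms = 0

    def update_log_read_result(self, fetch_offset, leader_leo, fetch_time_ms):
        if fetch_offset >= leader_leo:
            # Caught up to the leader's current LEO.
            self.last_caught_up_time_ms = max(self.last_caught_up_time_ms, fetch_time_ms)
        elif fetch_offset >= self.last_fetch_leader_leo:
            # Caught up to the leader's LEO as of the *previous* fetch.
            self.last_caught_up_time_ms = max(self.last_caught_up_time_ms, self.last_fetch_time_ms)
        self.last_fetch_leader_leo = leader_leo
        self.last_fetch_time_ms = fetch_time_ms

def is_out_of_sync(follower, now_ms):
    return (now_ms - follower.last_caught_up_time_ms) > REPLICA_LAG_TIME_MAX_MS

# Follower that is always exactly 1 message behind an ever-growing leader LEO:
f = Follower()
for t in range(0, 60_000, 1_000):        # one fetch per second for a minute
    leader_leo = t // 1_000 + 1          # leader appends one message per second
    f.update_log_read_result(fetch_offset=leader_leo - 1,
                             leader_leo=leader_leo,
                             fetch_time_ms=t)
print(is_out_of_sync(f, 60_000))  # False: lastCaughtUpTimeMs keeps advancing
```

By contrast, a follower that stops fetching entirely would stop advancing lastCaughtUpTimeMs and fall out of sync once the gap exceeds the lag limit, which is the "stuck follower" case in the getOutOfSyncReplicas comment.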