Question

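We are using Kafka Streams version 1.1.0.

We made some changes to the Kafka Streams code, and we are observing the following sequence of events in our production environment. We would like to understand whether this sequence of events is also possible in the unmodified 1.1.0 version.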

Time T0

StreamThread-1 : got assigned 0_1, 0_2 standby tasks
StreamThread-2 : got assigned 0_3 standby task

Time T1

Now let us say there is a consumer group rebalance.

Task 0_1 gets assigned to StreamThread-2 (i.e., the 0_1 standby task moves from StreamThread-1 to StreamThread-2).

Time T2

StreamThread-2 sees that a new standby task, 0_1, is assigned to it.
It tries to initializeStateStores for 0_1, but gets a *LockException* because the *owningThread* for the lock is StreamThread-1.

But the LockException is swallowed in the *initializeNewTasks* function of *AssignedTasks.java*,

and 0_1 remains in the *created* map inside *AssignedTasks.java*.

Time T3

StreamThread-1 realizes that 0_1 is not re-assigned to it and closes the suspended task.
As part of closing the suspended task, the entry for 0_1 is deleted from the *locks* map in the *unlock* function in StateDirectory.java.

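Time T4

The *CleanupThread* comes along after *cleanupDelayMs*, decides that the 0_1 directory in the local file system is obsolete, and deletes the directory!!!

Since the local directory for the task has been deleted, and 0_1 is still sitting in the *created* map, the changelog topic partitions for the 0_1 standby task will not be read until the next rebalance!!!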
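The T0 to T4 sequence above can be replayed with a small toy model. This is only a sketch under stated assumptions: `ToyStateDirectory`, `Outcome`, and `replay` are hypothetical names invented for illustration, and the model mimics only the lock-map bookkeeping described in this question, not Kafka Streams' real `StateDirectory` or `AssignedTasks` code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these classes are Kafka's real implementation.
class StandbyLockRaceSketch {

    // Hypothetical stand-in for the per-task lock bookkeeping described above.
    static final class ToyStateDirectory {
        private final Map<String, String> locks = new HashMap<>(); // taskId -> owning thread
        private final Set<String> taskDirs = new HashSet<>();      // task directories on disk

        // Returns false where the question describes a LockException being thrown.
        boolean lock(String taskId, String thread) {
            String owner = locks.get(taskId);
            if (owner != null && !owner.equals(thread)) {
                return false;
            }
            locks.put(taskId, thread);
            taskDirs.add(taskId);
            return true;
        }

        // Removes the lock entry only if the caller is the owning thread.
        void unlock(String taskId, String thread) {
            if (thread.equals(locks.get(taskId))) {
                locks.remove(taskId);
            }
        }

        // Models the cleanup thread: delete every directory that nobody holds a lock on.
        List<String> cleanObsolete() {
            List<String> removed = new ArrayList<>();
            for (String dir : new ArrayList<>(taskDirs)) {
                if (!locks.containsKey(dir)) {
                    taskDirs.remove(dir);
                    removed.add(dir);
                }
            }
            return removed;
        }

        boolean hasDirectory(String taskId) {
            return taskDirs.contains(taskId);
        }
    }

    // Result of replaying the sequence.
    static final class Outcome {
        final boolean lockAcquiredAtT2;
        final List<String> removedAtT4;
        final boolean stillInCreated;
        final boolean directoryExists;

        Outcome(boolean lockAcquiredAtT2, List<String> removedAtT4,
                boolean stillInCreated, boolean directoryExists) {
            this.lockAcquiredAtT2 = lockAcquiredAtT2;
            this.removedAtT4 = removedAtT4;
            this.stillInCreated = stillInCreated;
            this.directoryExists = directoryExists;
        }
    }

    static Outcome replay() {
        ToyStateDirectory dir = new ToyStateDirectory();
        Set<String> createdOnThread2 = new HashSet<>(); // models the "created" map

        // T0: initial standby assignment.
        dir.lock("0_1", "StreamThread-1");
        dir.lock("0_2", "StreamThread-1");
        dir.lock("0_3", "StreamThread-2");

        // T1: rebalance moves standby task 0_1 to StreamThread-2.
        createdOnThread2.add("0_1");

        // T2: StreamThread-2 tries to initialize 0_1; the failure is swallowed.
        boolean lockedAtT2 = dir.lock("0_1", "StreamThread-2");
        if (lockedAtT2) {
            createdOnThread2.remove("0_1");
        }

        // T3: StreamThread-1 closes its suspended task and releases the lock.
        dir.unlock("0_1", "StreamThread-1");

        // T4: the cleanup pass runs and finds 0_1 unlocked.
        List<String> removed = dir.cleanObsolete();

        return new Outcome(lockedAtT2, removed,
                createdOnThread2.contains("0_1"), dir.hasDirectory("0_1"));
    }

    public static void main(String[] args) {
        Outcome o = replay();
        System.out.println("lock acquired at T2: " + o.lockAcquiredAtT2);
        System.out.println("directories removed at T4: " + o.removedAtT4);
        System.out.println("0_1 still in created: " + o.stillInCreated);
        System.out.println("0_1 directory exists: " + o.directoryExists);
    }
}
```

In this toy model nothing retries the lock after T2, which is the behaviour we are asking about; if the real code does retry somewhere, the model would need an extra step between T3 and T4.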

Please let us know whether this sequence is valid. If it is not, what prevents it from happening?

We see that in https://issues.apache.org/jira/browse/KAFKA-6122 the retry around the locks was removed. Please let us know why the retry mechanism was removed.

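We also saw the discussion in https://github.com/apache/kafka/pull/3653 about swallowing the LockException and the retry not being there. dguy replied: "The retry doesn't happen in this block of code. It will happen the next time the runLoop executes."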

But the thread state has already changed to RUNNING, so updateNewAndRestoringTasks will not be called again in StreamThread's runOnce.

Finally, in TaskManager#updateNewAndRestoringTasks, there is an IF condition that checks whether all active tasks are running.

Should we change it from

if (active.allTasksRunning()) {...}

to

if (active.allTasksRunning() && standby.allTasksRunning()) {...}

??
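To make the difference between the two conditions concrete, here is a minimal sketch. It assumes, purely for illustration, that a task set counts as "all running" when nothing is left in its created set; `ToyAssignedTasks`, `readyActiveOnly`, `readyActiveAndStandby`, and the active task id 0_0 are hypothetical and are not taken from the Kafka Streams source.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: compares the current and the proposed readiness checks.
class StandbyGateSketch {

    // Hypothetical stand-in for a set of assigned tasks.
    static final class ToyAssignedTasks {
        final Set<String> created = new HashSet<>();
        final Set<String> running = new HashSet<>();

        // Assumption: "all running" means no task is still waiting in created.
        boolean allTasksRunning() {
            return created.isEmpty();
        }
    }

    // Current condition: only active tasks are consulted.
    static boolean readyActiveOnly(ToyAssignedTasks active, ToyAssignedTasks standby) {
        return active.allTasksRunning();
    }

    // Proposed condition: standby tasks must have left created as well.
    static boolean readyActiveAndStandby(ToyAssignedTasks active, ToyAssignedTasks standby) {
        return active.allTasksRunning() && standby.allTasksRunning();
    }

    public static void main(String[] args) {
        ToyAssignedTasks active = new ToyAssignedTasks();
        active.running.add("0_0"); // hypothetical active task

        ToyAssignedTasks standby = new ToyAssignedTasks();
        standby.running.add("0_3");
        standby.created.add("0_1"); // stuck after the swallowed LockException

        System.out.println("active only: " + readyActiveOnly(active, standby));
        System.out.println("active and standby: " + readyActiveAndStandby(active, standby));
    }
}
```

The sketch only shows what each condition evaluates to while 0_1 is stuck in created; it says nothing about whether making this change is safe in Kafka Streams itself.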

The standby task remains in the "created" HashMap in AssignedTasks.

0 answers: