Mesos/Marathon checkpointing and HA

Time: 2015-03-19 12:37:13

Tags: mesos mesosphere marathon

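Mesos and Marathon mention checkpointing from time to time, but I can't find a good explanation anywhere of how it works. Also, what does it mean in practice?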

1) Is the task's current state stored continuously, or is only the task ID stored? Where is it stored, and what does it contain?
2) There are two Marathon instances. Marathon has been running Nginx for a week, then goes down. Does that mean the actual Nginx application state carries over to the second Marathon instance, or is the task simply restarted from the beginning? If the task's actual state is copied, isn't that a lot of data to persist continuously and pass around between slaves?

1 answer:

Answer 0 (score: 1)


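Slave recovery is a Mesos feature that allows:

  • executors/tasks to keep running while the slave process is down
  • a restarted slave process to reconnect to the executors/tasks still running on that slave (see Mesos Slave recovery).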
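
To make the two points above more concrete, here is a toy model in Python. It is only a sketch of the idea, not how Mesos actually implements it: the file layout, the field names, and the helpers `checkpoint_task` and `recover_tasks` are all made up for illustration. The point is that only a small piece of metadata goes to disk, and that is enough for a restarted process to find what was running.

```python
import json
import os
import tempfile

# Toy model only: the file layout, field names, and helper names here are
# hypothetical and do NOT reflect the real Mesos checkpoint format.


def checkpoint_task(meta_dir, task_id, executor_pid):
    """Persist just enough metadata to locate the executor again."""
    path = os.path.join(meta_dir, f"{task_id}.json")
    with open(path, "w") as f:
        # Note what is absent: nothing about the application's own state
        # (open connections, caches, counters) is written.
        json.dump({"task_id": task_id, "executor_pid": executor_pid}, f)
    return path


def recover_tasks(meta_dir):
    """What a restarted slave process would do: read the metadata back."""
    recovered = {}
    for name in sorted(os.listdir(meta_dir)):
        with open(os.path.join(meta_dir, name)) as f:
            info = json.load(f)
        recovered[info["task_id"]] = info["executor_pid"]
    return recovered


meta_dir = tempfile.mkdtemp()
checkpoint_task(meta_dir, "nginx.1", 4242)

# The slave process "restarts" here: its in-memory state is gone,
# but the files on disk survive and the executor was never stopped.
recovered = recover_tasks(meta_dir)
print(recovered)
```

The restarted process ends up with a mapping from task ID to executor PID, which is all it needs in this toy model to reconnect. The task itself is not copied or serialized anywhere.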

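Regarding your questions, this means:

  1. Enough information is stored (a bit more than the TaskID) for a new slave process to reconnect to the executors/tasks that are still running.

  2. Because the task's own state is not checkpointed, the task would be started again from the beginning.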

Hope this helps, Joerg