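I am using the file source stream component to read files from a directory and send a File instance to a custom processor, which reads the file and launches a specific task using the TaskLauncher sink. If I drop 5 files in the directory, 5 tasks are launched at the same time. What I want is for the tasks to execute one after another, so I need to monitor each task's status and make sure the previous task has completed before the next one is launched. What are my options for implementing this? As a side note, I am running this on a Yarn cluster.
Thanks,
-Frank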
Answer 0 (score: 0)
I think the YARN TaskLauncher launching tasks asynchronously may be the reason it looks as though all the tasks are launched at the same time. One possible approach you could try is a custom task launcher sink that launches a task, waits for the task status to be completed, and only then processes the next trigger request.
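To make the idea concrete, here is a minimal sketch of the "launch, wait for completion, then take the next request" loop. It is not the Spring Cloud Data Flow or YARN API: `SequentialTaskLauncher`, `TaskClient`, `launch`, and `isComplete` are hypothetical names standing in for whatever your custom sink uses to start a task and query its status. The sketch assumes that the status can be polled and that a single worker thread per sink instance is acceptable.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class SequentialTaskLauncher {

    /** Hypothetical abstraction over the real launcher and its status lookup. */
    public interface TaskClient {
        String launch(String request);      // starts the task, returns a task id
        boolean isComplete(String taskId);  // true once the task has finished
    }

    // A single worker thread means queued requests are handled strictly in order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final TaskClient client;
    private final long pollMillis;

    SequentialTaskLauncher(TaskClient client, long pollMillis) {
        this.client = client;
        this.pollMillis = pollMillis;
    }

    /** Queues a launch request; it runs only after all earlier ones have completed. */
    void submit(String request) {
        worker.submit(() -> {
            String taskId = client.launch(request);
            try {
                // Block this worker until the task reports completion.
                while (!client.isComplete(taskId)) {
                    Thread.sleep(pollMillis);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    /** Stops accepting requests and waits for the queued ones to drain. */
    boolean shutdownAndWait(long timeoutSeconds) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

In a real sink, `submit` would be called from the message handler for each incoming launch request. You would probably also want a timeout and handling for failed tasks, so that one stuck task does not block the queue forever.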