Question

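We are creating a web script in Alfresco (Java-backed). At some interval, this web script should download files (a large number of them) from a remote system, process them, and version them in Alfresco.

This web script will be triggered from a Jenkins box, so we plan to poll the web script for its status until the whole process has completed. This will happen regularly, say daily or weekly.

How can the web script periodically send intermediate responses to the Jenkins job and keep processing? Once all processing is complete, the same web script call should send a completion status to the Jenkins box.

How can I achieve this?

Note: I cannot use cron. Only Jenkins can be used, because the input for the web script will be sent from Jenkins (which receives it from a different product).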

Answer 1

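Below I describe how to implement batch processing in Alfresco. Before going into detail, I would also suggest integrating this process with an Activiti workflow (or jBPM, if you prefer).

As described later, the process sends events to notify listeners of the job's progress. A listener for these events can call Jenkins directly.

Instead of calling Jenkins directly, the listener can update a workflow. In that case, the logic that calls Jenkins is implemented in a workflow task. This makes it easier to separate the batch-processing logic from the notification logic. In addition, the workflow can be used to store information about the job's progress, which can later be queried by whoever or whatever is interested.

The long-running process:

I don't know which version of Alfresco you are using, so I will describe the solution for version 4.1. Alfresco supports long-running batch processes, mainly through the classes and interfaces in the package org.alfresco.repo.batch: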

BatchProcessWorkProvider

BatchProcessor

BatchProcessor.BatchProcessWorker

BatchMonitor

BatchMonitorEvent

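You need to provide implementations for two interfaces: BatchProcessWorkProvider and BatchProcessor.BatchProcessWorker.

Both interfaces are reproduced below. The first returns the work load; the second defines what a worker does.

BatchProcessWorkProvider: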

/**
 * An interface that provides work loads to the {@link BatchProcessor}.
 * 
 * @author Derek Hulley
 * @since 3.4
 */
public interface BatchProcessWorkProvider<T>
{
    /**
     * Get an estimate of the total number of objects that will be provided by this instance.
     * Instances can provide accurate answers on each call, but only if the answer can be
     * provided quickly and efficiently; usually it is enough to cache the result after
     * providing an initial estimate.
     * 
     * @return                  a total work size estimate
     */
    int getTotalEstimatedWorkSize();

    /**
     * Get the next lot of work for the batch processor.  Implementations should return
     * the largest number of entries possible; the {@link BatchProcessor} will keep calling
     * this method until it has enough work for the individual worker threads to process
     * or until the work load is empty.
     * 
     * @return                  the next set of work object to process or an empty collection
     *                          if there is no more work remaining.
     */
    Collection<T> getNextWork();
}

BatchProcessWorker:

/**
 * An interface for workers to be invoked by the {@link BatchProcessor}.
 */
public interface BatchProcessWorker<T>
{
    /**
     * Gets an identifier for the given entry (for monitoring / logging purposes).
     * 
     * @param entry
     *            the entry
     * @return the identifier
     */
    public String getIdentifier(T entry);

    /**
     * Callback to allow thread initialization before the work entries are
     * {@link #process(Object) processed}.  Typically, this will include authenticating
     * as a valid user and disabling or enabling any system flags that might affect the
     * entry processing.
     */
    public void beforeProcess() throws Throwable;

    /**
     * Processes the given entry.
     * 
     * @param entry
     *            the entry
     * @throws Throwable
     *             on any error
     */
    public void process(T entry) throws Throwable;

    /**
     * Callback to allow thread cleanup after the work entries have been
     * {@link #process(Object) processed}.
     * Typically, this will involve cleanup of authentication and resetting any
     * system flags previously set.
     * <p/>
     * This call is made regardless of the outcome of the entry processing.
     */
    public void afterProcess() throws Throwable;
}

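In practice, BatchProcessWorkProvider returns a collection of "work to do" objects (the class "T"). The "work to do" is a class you need to provide. In your case, this class could carry the information needed to fetch a subset of the files from the remote system. The process method uses this information to actually do the work. As an example, we can call T ImportFiles.

Your BatchProcessWorkProvider should split the list of files into reasonably sized collections of ImportFiles.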
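The chunking described above could look roughly like this. It is only a sketch under assumptions: WorkProvider is a local stand-in that mirrors the two methods of BatchProcessWorkProvider so the example compiles without Alfresco on the classpath, and ImportFiles, ChunkedImportWorkProvider and filesPerChunk are hypothetical names, not part of Alfresco.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Local stand-in mirroring org.alfresco.repo.batch.BatchProcessWorkProvider<T>;
// in a real module you would implement the Alfresco interface instead.
interface WorkProvider<T> {
    int getTotalEstimatedWorkSize();
    Collection<T> getNextWork();
}

// Hypothetical unit of work: the remote paths that one process() call should import.
final class ImportFiles {
    private final List<String> remotePaths;

    ImportFiles(List<String> remotePaths) {
        this.remotePaths = Collections.unmodifiableList(new ArrayList<>(remotePaths));
    }

    List<String> getRemotePaths() {
        return remotePaths;
    }
}

// Splits the full file list into ImportFiles chunks and hands them out once.
final class ChunkedImportWorkProvider implements WorkProvider<ImportFiles> {
    private final List<ImportFiles> chunks = new ArrayList<>();
    private boolean handedOut = false;

    ChunkedImportWorkProvider(List<String> allRemotePaths, int filesPerChunk) {
        if (filesPerChunk <= 0) {
            throw new IllegalArgumentException("filesPerChunk must be positive");
        }
        for (int i = 0; i < allRemotePaths.size(); i += filesPerChunk) {
            int end = Math.min(i + filesPerChunk, allRemotePaths.size());
            chunks.add(new ImportFiles(allRemotePaths.subList(i, end)));
        }
    }

    // The work size is counted in chunks, because a chunk is the unit a worker receives.
    @Override
    public int getTotalEstimatedWorkSize() {
        return chunks.size();
    }

    // Returns every chunk on the first call and an empty collection afterwards,
    // which is how the interface signals that no work remains.
    @Override
    public synchronized Collection<ImportFiles> getNextWork() {
        if (handedOut) {
            return Collections.emptyList();
        }
        handedOut = true;
        return new ArrayList<>(chunks);
    }
}
```

With five paths and filesPerChunk set to 2, this provider yields three chunks (2, 2 and 1 files). What counts as a "reasonable" chunk size depends on file sizes and transaction duration, so treat the value as something to tune.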

The most important method in BatchProcessWorker is

public void process(ImportFiles filesToImport) throws Throwable;

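This is the method you must implement. For the other methods, there is an adapter, BatchProcessor.BatchProcessWorkerAdapter, that provides default implementations.

The process method receives an ImportFiles instance as its argument and can use it to locate the files on the remote server and import them.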
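A worker along those lines might be sketched as follows. Everything here is an assumption made for illustration: WorkerAdapter is a local stand-in for BatchProcessor.BatchProcessWorkerAdapter, and ImportFiles, RemoteFileSource, VersionedStore and ImportFilesWorker are hypothetical names. In a real module the store would be backed by Alfresco's content and versioning services.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Local stand-in for BatchProcessor.BatchProcessWorkerAdapter<T>:
// the before/after callbacks default to no-ops.
abstract class WorkerAdapter<T> {
    public abstract String getIdentifier(T entry);
    public void beforeProcess() throws Throwable { }
    public abstract void process(T entry) throws Throwable;
    public void afterProcess() throws Throwable { }
}

// Hypothetical unit of work: the remote paths that one process() call should import.
final class ImportFiles {
    private final List<String> remotePaths;

    ImportFiles(List<String> remotePaths) {
        this.remotePaths = Collections.unmodifiableList(new ArrayList<>(remotePaths));
    }

    List<String> getRemotePaths() {
        return remotePaths;
    }
}

// Hypothetical abstraction over the remote system the files come from.
interface RemoteFileSource {
    byte[] download(String remotePath) throws Exception;
}

// Hypothetical abstraction over "store this content as a new version".
interface VersionedStore {
    void saveNewVersion(String path, byte[] content);
}

// Downloads each file named in the chunk and stores it as a new version.
final class ImportFilesWorker extends WorkerAdapter<ImportFiles> {
    private final RemoteFileSource source;
    private final VersionedStore store;

    ImportFilesWorker(RemoteFileSource source, VersionedStore store) {
        this.source = source;
        this.store = store;
    }

    // Used for monitoring and logging only.
    @Override
    public String getIdentifier(ImportFiles entry) {
        return entry.getRemotePaths().toString();
    }

    // Any exception propagates to the caller, which decides whether to retry.
    @Override
    public void process(ImportFiles filesToImport) throws Throwable {
        for (String path : filesToImport.getRemotePaths()) {
            byte[] content = source.download(path);
            store.saveNewVersion(path, content);
        }
    }
}
```

Keeping the remote access and the storage behind small interfaces is a design choice of this sketch; it makes the worker easy to exercise without a running repository.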

Finally, you need to instantiate the BatchProcessor:

try {
    final RetryingTransactionHelper retryingTransactionHelper = transactionService.getRetryingTransactionHelper();
    BatchProcessor<ImportFiles> batchProcessor = new BatchProcessor<ImportFiles>(processName,
            retryingTransactionHelper, workProvider, threads, batchSize,
            applicationEventPublisher, logger, loggingInterval);
    batchProcessor.process(worker, true);
} 
catch (LockAcquisitionException e) {
    /* Manage exception */
}

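where

processName: a description of the long-running process

workProvider: an instance of BatchProcessWorkProvider

threads: the number of worker threads (run in parallel)

batchSize: the number of entries processed in the same transaction

logger: the logger used to report progress

loggingInterval: the number of entries to process before progress is reported

retryingTransactionHelper: a helper class that retries transactions that fail because of concurrent updates (optimistic locking) or deadlock conditions

applicationEventPublisher: an instance of Spring's ApplicationEventPublisher, which is usually (in Alfresco as well) the Spring ApplicationContext

To send events to Jenkins, you can use the applicationEventPublisher. The following link explains how to use it; it is a standard Spring feature.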

Spring events

Events can be sent, for example, from the method

process(ImportFiles filesToImport)

described above.
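To illustrate the publish/listen split without pulling in Spring, here is a plain-Java sketch. It is not Spring's API: SimpleEventPublisher and ImportProgressListener are local stand-ins for ApplicationEventPublisher and ApplicationListener, and ImportProgressEvent and JenkinsStatusListener are hypothetical names. The actual HTTP call to Jenkins is deliberately left out; the listener only records the status it would send.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical event payload: how many chunks are done out of how many.
final class ImportProgressEvent {
    private final int processedChunks;
    private final int totalChunks;

    ImportProgressEvent(int processedChunks, int totalChunks) {
        this.processedChunks = processedChunks;
        this.totalChunks = totalChunks;
    }

    int getProcessedChunks() { return processedChunks; }
    int getTotalChunks() { return totalChunks; }
    boolean isComplete() { return processedChunks >= totalChunks; }
}

// Stand-in for a Spring ApplicationListener.
interface ImportProgressListener {
    void onProgress(ImportProgressEvent event);
}

// Stand-in for Spring's ApplicationEventPublisher: delivers events synchronously.
final class SimpleEventPublisher {
    private final List<ImportProgressListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(ImportProgressListener listener) {
        listeners.add(listener);
    }

    void publishEvent(ImportProgressEvent event) {
        for (ImportProgressListener listener : listeners) {
            listener.onProgress(event);
        }
    }
}

// Translates progress events into the status messages Jenkins would receive.
final class JenkinsStatusListener implements ImportProgressListener {
    private final List<String> sentStatuses = new CopyOnWriteArrayList<>();

    @Override
    public void onProgress(ImportProgressEvent event) {
        String status = event.isComplete()
                ? "COMPLETED"
                : "IN_PROGRESS " + event.getProcessedChunks() + "/" + event.getTotalChunks();
        // A real listener would send this status to Jenkins (or update a workflow) here.
        sentStatuses.add(status);
    }

    List<String> getSentStatuses() {
        return sentStatuses;
    }
}
```

In this sketch the worker's process method would call publishEvent after each chunk, so the worker never needs to know who is listening or how Jenkins is reached.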

Answer 2

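I won't argue with your choice of a web script to implement your logic, although I'm not one hundred percent sure it's a good one.

As for your question, you can store the overall progress status of the job/logic execution in some singleton, and use another web script (or just the same web script with different parameters) to return that value to you.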
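A minimal sketch of such a singleton, assuming one repository JVM (a static instance is not shared across cluster nodes). JobState, ImportJobStatus and the JSON shape are hypothetical names chosen for illustration; the web scripts that call start() and toJson() are not shown.

```java
// Hypothetical job states reported to the poller.
enum JobState { IDLE, RUNNING, COMPLETED, FAILED }

// Process-wide holder for the import job's progress.
final class ImportJobStatus {
    private static final ImportJobStatus INSTANCE = new ImportJobStatus();

    private JobState state = JobState.IDLE;
    private int processed;
    private int total;

    private ImportJobStatus() { }

    static ImportJobStatus getInstance() {
        return INSTANCE;
    }

    // Called by the web script that starts the job; refuses a second concurrent run.
    synchronized boolean start(int totalFiles) {
        if (state == JobState.RUNNING) {
            return false;
        }
        processed = 0;
        total = totalFiles;
        // With nothing to do, the job is complete immediately.
        state = (totalFiles == 0) ? JobState.COMPLETED : JobState.RUNNING;
        return true;
    }

    // Called by the background work after each file; flips to COMPLETED at the end.
    synchronized void fileProcessed() {
        processed++;
        if (state == JobState.RUNNING && processed >= total) {
            state = JobState.COMPLETED;
        }
    }

    synchronized void fail() {
        state = JobState.FAILED;
    }

    // What the status web script could return to the poller.
    synchronized String toJson() {
        return "{\"state\":\"" + state + "\",\"processed\":" + processed
                + ",\"total\":" + total + "}";
    }
}
```

The Jenkins job would then call the status web script in a loop until the state is COMPLETED or FAILED. For that to work, the starting web script has to hand the actual work to a background thread and return right away, otherwise the HTTP request stays open for the whole import.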

Alfresco - Java-backed web script - sending responses asynchronously
