Start a fetch during a request, queue requests that arrive in the meantime, then serve the data to all of them

Date: 2017-01-06 08:56:40

Tags: elixir phoenix-framework otp

I'm having trouble implementing the following flow with Elixir and Phoenix:

  1. A request from user A comes in; the third-party API cache is empty
  2. A third-party API fetch is started over HTTP
     1. The fetch has not finished yet when a request from user B comes in
     2. User B waits for the fetch to complete
  3. The fetch completes and the fetched data is written to a cache (e.g. Redis)
  4. The cached data is served to all waiting users
  5. Different routes or route parameters should use separate queues. A request that comes in while the third-party API data is still being fetched must under no circumstances trigger another fetch with the same parameters. The waiting part (2.2) is the crucial piece for me.

From what I've read so far, this seems solvable using standard Elixir/Erlang/OTP features.

1 answer:

Answer 0 (score: 4):

Yes, compared to most other languages this is quite easy to do in Elixir/Erlang. Here is one way to do it with an in-memory cache. The key point to note, if you've used GenServer before but not GenServer.reply/2, is that we store the `from` argument of incoming `handle_call` requests, and when the fetch completes we reply to each of them. I haven't handled errors in a nice way in this proof-of-concept code, but it correctly handles the most interesting part, namely 2.2:

defmodule CachedParallelHTTP do
  def start_link do
    GenServer.start_link(__MODULE__, :ok)
  end

  def init(_) do
    {:ok, %{}}
  end

  def handle_call({:fetch, arg}, from, state) do
    case state[arg] do
      %{status: :fetched, response: response} ->
        # We've already made this request; just return the cached response.
        {:reply, response, state}
      %{status: :fetching} ->
        # We're currently running this request. Store the `from` and reply to the caller later.
        state = update_in(state, [arg, :froms], fn froms -> [from | froms] end)
        {:noreply, state}
      nil ->
        # This is a brand new request. Let's create the new state and start the request.
        pid = self()
        state = Map.put(state, arg, %{status: :fetching, froms: [from]})
        Task.start(fn ->
          IO.inspect {:making_request, arg}
          # Simulate a long synchronous piece of code. The actual HTTP call should be made here.
          Process.sleep(2000)
          # dummy response
          response = arg <> arg <> arg
          # Let the server know that this request is done so it can reply to all the `froms`,
          # including the ones that were added while this request was being executed.
          GenServer.call(pid, {:fetched, arg, response})
        end)
        {:noreply, state}
    end
  end

  def handle_call({:fetched, arg, response}, _from, state) do
    # A request was completed.
    case state[arg] do
      %{status: :fetching, froms: froms} ->
        IO.inspect "notifying #{length(froms)} clients waiting for #{arg}"
        # Reply to all the callers who've been waiting for this request.
        for from <- froms do
          GenServer.reply(from, response)
        end
        # Cache the response in the state, for future callers.
        state = Map.put(state, arg, %{status: :fetched, response: response})
        {:reply, :ok, state}
    end
  end
end
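For completeness, here is a minimal sketch of how a Phoenix controller might call into this server. The controller and route names are assumptions, not part of the original answer, and it assumes the server was registered under its module name when started (e.g. with `name: CachedParallelHTTP` passed to `GenServer.start_link/3` in the supervision tree).

```elixir
defmodule MyAppWeb.DataController do
  use MyAppWeb, :controller

  def show(conn, %{"id" => id}) do
    # The controller process blocks here until the fetch completes;
    # concurrent requests for the same `id` share a single fetch.
    # An explicit timeout larger than the expected fetch duration is
    # passed, since GenServer.call/3 defaults to 5 seconds.
    response = GenServer.call(CachedParallelHTTP, {:fetch, id}, 15_000)
    text(conn, response)
  end
end
```

Because the GenServer replies asynchronously via `GenServer.reply/2`, each waiting controller process simply appears to make an ordinary synchronous call.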

Here's a small piece of code to test this:

now = fn -> DateTime.utc_now |> DateTime.to_iso8601 end

{:ok, s} = CachedParallelHTTP.start_link
IO.inspect {:before_request, now.()}
for i <- 1..3 do
  Task.start(fn ->
    response = GenServer.call(s, {:fetch, "123"})
    IO.inspect {:response, "123", i, now.(), response}
  end)
end
:timer.sleep(1000)
for i <- 1..5 do
  Task.start(fn ->
    response = GenServer.call(s, {:fetch, "456"})
    IO.inspect {:response, "456", i, now.(), response}
  end)
end
IO.inspect {:after_request, now.()}
:timer.sleep(10000)

Output:

{:before_request, "2017-01-06T10:30:07.852986Z"}
{:making_request, "123"}
{:after_request, "2017-01-06T10:30:08.862425Z"}
{:making_request, "456"}
"notifying 3 clients waiting for 123"
{:response, "123", 3, "2017-01-06T10:30:07.860758Z", "123123123"}
{:response, "123", 2, "2017-01-06T10:30:07.860747Z", "123123123"}
{:response, "123", 1, "2017-01-06T10:30:07.860721Z", "123123123"}
"notifying 5 clients waiting for 456"
{:response, "456", 5, "2017-01-06T10:30:08.862556Z", "456456456"}
{:response, "456", 4, "2017-01-06T10:30:08.862540Z", "456456456"}
{:response, "456", 3, "2017-01-06T10:30:08.862524Z", "456456456"}
{:response, "456", 2, "2017-01-06T10:30:08.862504Z", "456456456"}
{:response, "456", 1, "2017-01-06T10:30:08.862472Z", "456456456"}

Note that by using GenServer.reply and Task.start, a single GenServer is able to handle multiple parallel requests while remaining completely synchronous from the point of view of the API user. Depending on how much load you want to handle, you may want to look into using a pool of GenServers.
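One way such a pool could be sketched (an assumption building on the answer, not part of it) is to start a fixed number of workers and route each argument to a worker by hashing, so that requests for the same argument always land on the same GenServer and the "one fetch per argument" guarantee is preserved:

```elixir
defmodule CachedParallelHTTP.Pool do
  @pool_size 8

  # Start @pool_size workers; in a real application these would be
  # placed under a Supervisor rather than started bare like this.
  def start_link do
    pids =
      for _ <- 1..@pool_size do
        {:ok, pid} = CachedParallelHTTP.start_link()
        pid
      end

    {:ok, pids}
  end

  # :erlang.phash2/2 deterministically maps the argument to a worker
  # index, so concurrent calls with the same arg hit the same worker.
  def fetch(pids, arg) do
    index = :erlang.phash2(arg, @pool_size)
    GenServer.call(Enum.at(pids, index), {:fetch, arg})
  end
end
```

This spreads unrelated arguments across workers while keeping all callers for a given argument queued on a single process.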