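F# makes it easy to define asynchronous computations with the async builder. You can write a whole program and then pass it to Async.RunSynchronously.
The problem I have is that some async actions must not run at the same time. They should be forced to wait for other async actions to complete, much like a mutex. However, I don't want to simply chain them all serially, because that would be inefficient.
Concrete example: a download cache
Suppose I want to fetch some remote files using a local file cache. My application calls fetchFile : Async<string> in many places, but calling fetchFile on the same URL concurrently is risky, because multiple simultaneous writes can corrupt the cache. Instead, the fetchFile command should behave as follows:
fetchFile should work in parallel
I'm imagining some kind of stateful DownloadManager class that requests can be sent to and that orders them internally.
How do F# programmers usually implement this kind of logic with async?
Imaginary usage: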
let dm = new DownloadManager()
let urls = [
    "https://www.google.com";
    "https://www.google.com";
    "https://www.wikipedia.org";
    "https://www.google.com";
    "https://www.bing.com";
]
let results =
    urls
    |> Seq.map dm.Download
    |> Async.Parallel
    |> Async.RunSynchronously
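Note: I previously asked this question about how to run async actions in a semi-parallel fashion, but I now realize that approach is hard to compose.
Note: I don't need to worry about multiple instances of the application running at once. In-memory locking is sufficient.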
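Answer 0: (score: 3)
Update
Petricek suggested that Async.StartChild is a better choice than lazy values, so I changed lazyDownload to asyncDownload.
You can use a MailboxProcessor as a download manager that handles the cache. MailboxProcessor is an F# construct that processes a queue of messages one at a time, which ensures there are no conflicts.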
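To make the "one message at a time" point concrete, here is a minimal sketch that is not part of the answer's code. The CounterMsg type and the counter agent are hypothetical names used only for illustration. Many workflows post to the agent concurrently, yet the count is never corrupted, because only the agent's loop touches the state.

```fsharp
// Hypothetical message type for a tiny counter agent (illustration only).
type CounterMsg =
    | Increment
    | Get of AsyncReplyChannel<int>

let counter =
    MailboxProcessor.Start(fun inbox ->
        // The state lives in the loop argument; only this loop ever sees it.
        let rec loop count = async {
            let! msg = inbox.Receive()
            match msg with
            | Increment -> return! loop (count + 1)
            | Get reply ->
                reply.Reply count
                return! loop count }
        loop 0)

// Post 1000 increments from workflows that run in parallel.
[ for _ in 1 .. 1000 -> async { counter.Post Increment } ]
|> Async.Parallel
|> Async.RunSynchronously
|> ignore

// The queue is FIFO, so this request is handled after all the increments.
let total = counter.PostAndReply(fun reply -> Get reply)
printfn "total = %d" total
```

The download manager below follows the same idea, except that its state is a cache of downloads rather than a number.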
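First, you need a processor that can maintain state:
let stateFull hndl initState =
    MailboxProcessor.Start(fun inbox ->
        // Each message is a function from the current state to the next state.
        let rec loop state : Async<unit> = async {
            try
                let! f = inbox.Receive()
                let! newState = f state
                return! loop newState
            with e ->
                // On failure, the handler decides which state to continue with.
                return! loop (hndl e state)
        }
        loop initState
    )
The first parameter is an error handler and the second is the initial state, in this case a Map<string, Async<string>>. Here is our downloadManager:
let downloadManager =
    stateFull (fun e s -> printfn "%A" e ; s) (Map.empty : Map<string, _>)
To call the mailbox we need to use PostAndReply:
let applyReplyS f (agent: MailboxProcessor<'a->Async<'a>>) =
    agent.PostAndReply(fun (reply:AsyncReplyChannel<'r>) ->
        // The posted message computes the new state and sends the reply.
        fun v -> async {
            let st, r = f v
            reply.Reply r
            return st
        })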
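This function expects a folder function: given the current cache, it checks for the URL, adds an Async<string> if none is found, and returns the updated cache together with the value to reply with.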
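As a rough illustration of the shape such a folder function has (state in, new state plus reply out), here is a small sketch that uses a plain Map and no agent. The name countVisit is hypothetical and only stands in for the cache lookup described above.

```fsharp
// Hypothetical folder function: takes the current state and returns
// the updated state together with the value to reply with.
let countVisit (key: string) (visits: Map<string, int>) =
    let n = (visits |> Map.tryFind key |> Option.defaultValue 0) + 1
    Map.add key n visits, n

// Thread the state through two calls by hand, as the agent loop would.
let state1, reply1 = countVisit "a" Map.empty
let state2, reply2 = countVisit "a" state1
printfn "%d %d" reply1 reply2
```

Because the function itself is pure, all of the synchronization is left to the agent that applies it.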
First, the asyncDownload function:
let asyncDownload url =
    async {
        let started = System.DateTime.UtcNow.Ticks
        do! Async.Sleep 30
        let finished = System.DateTime.UtcNow.Ticks
        let r = sprintf "Downloaded  %A it took: %dms %s" (started / 10000L) ((finished - started) / 10000L) url
        printfn "%s" r
        return r
    }
It is just a dummy function that returns a string containing timing information.
Now the folder function that checks the cache:
let folderCache url cache =
    cache
    |> Map.tryFind url
    |> Option.map(fun ld -> cache, ld)
    |> Option.defaultWith (fun () ->
        // Not cached yet: start the download once and cache the handle to it.
        let ld = asyncDownload url |> Async.StartChild |> Async.RunSynchronously
        cache |> Map.add url ld, ld
    )
Finally, our download function:
let downloadUrl url =
    downloadManager
    |> applyReplyS (folderCache url)
// val downloadUrl: url: string -> Async<string>
Test
let s = System.DateTime.UtcNow.Ticks
printfn "started %A" (s / 10000L)
let res =
    List.init 50 (fun i -> i, downloadUrl (string <| i % 5) )
    |> List.groupBy (snd >> Async.RunSynchronously)
    |> List.map (fun (t, ts) -> sprintf "%s - %A" t (ts |> List.map fst ) )
let f = System.DateTime.UtcNow.Ticks
printfn "finish  %A" (f / 10000L)
printfn "elapsed %dms" ((f - s) / 10000L)
res |> printfn "Result: \n%A"
This produces the following output:
started 63676683215256L
Downloaded 63676683215292L it took: 37ms "2"
Downloaded 63676683215292L it took: 36ms "3"
Downloaded 63676683215292L it took: 36ms "1"
Downloaded 63676683215291L it took: 38ms "0"
Downloaded 63676683215292L it took: 36ms "4"
finish 63676683215362L
elapsed 106ms
Result:
["Downloaded 63676683215291L it took: 38ms "0" - [0; 5; 10; 15; 20; 25; 30; 35; 40; 45]";
"Downloaded 63676683215292L it took: 36ms "1" - [1; 6; 11; 16; 21; 26; 31; 36; 41; 46]";
"Downloaded 63676683215292L it took: 37ms "2" - [2; 7; 12; 17; 22; 27; 32; 37; 42; 47]";
"Downloaded 63676683215292L it took: 36ms "3" - [3; 8; 13; 18; 23; 28; 33; 38; 43; 48]";
"Downloaded 63676683215292L it took: 36ms "4" - [4; 9; 14; 19; 24; 29; 34; 39; 44; 49]"]
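Answer 1: (score: 3)
I agree with @AMieres that a mailbox processor is a good way to do this. My version of the code is a bit less generic: it uses the mailbox processor directly for this purpose, so it may be a bit simpler.
Our mailbox processor has just one message: you ask it to download a URL, and it gives you back an asynchronous workflow that you can wait on to get the result:
type DownloadMessage =
    | Download of string * AsyncReplyChannel<Async<string>>
We need a helper function to download a URL asynchronously:
let asyncDownload url = async {
    let wc = new System.Net.WebClient()
    printfn "Downloading: %s" url
    return! wc.AsyncDownloadString(System.Uri(url)) }
In the mailbox processor we keep a mutable cache (this is fine, because the mailbox processor handles messages one at a time). When a download request arrives, we check whether the cache already holds a download for that URL. If not, we start the download as a child async and add it to the cache, so the cache contains asynchronous workflows that represent the results of running downloads.
let downloadCache = MailboxProcessor.Start(fun inbox -> async {
    let cache = System.Collections.Generic.Dictionary<_, _>()
    while true do
        let! (Download(url, repl)) = inbox.Receive()
        if not (cache.ContainsKey url) then
            // Start the download once; proc is a handle that waits for its result.
            let! proc = asyncDownload url |> Async.StartChild
            cache.Add(url, proc)
        repl.Reply(cache.[url]) })
To actually download through the cache, we just send a request to the mailbox processor and then wait for the result of the returned workflow (which may be shared by multiple requests).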
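The code for that last step did not survive in the text above, so here is a self-contained sketch of what the call might look like. The names downloadUsingCache and fakeDownload are assumptions, and fakeDownload sleeps instead of using WebClient so the example runs without network access.

```fsharp
type DownloadMessage =
    | Download of string * AsyncReplyChannel<Async<string>>

// Stand-in for a real download (assumption): sleeps instead of using the network.
let fakeDownload (url: string) = async {
    printfn "Downloading: %s" url
    do! Async.Sleep 50
    return sprintf "contents of %s" url }

let downloadCache = MailboxProcessor.Start(fun inbox -> async {
    let cache = System.Collections.Generic.Dictionary<string, Async<string>>()
    while true do
        let! (Download(url, repl)) = inbox.Receive()
        if not (cache.ContainsKey url) then
            let! proc = fakeDownload url |> Async.StartChild
            cache.Add(url, proc)
        repl.Reply(cache.[url]) })

// Hypothetical helper: ask the agent for the (possibly shared) workflow,
// then wait for that workflow to produce the file contents.
let downloadUsingCache url = async {
    let! pending = downloadCache.PostAndAsyncReply(fun repl -> Download(url, repl))
    return! pending }

// Requests for the same URL share one download; different URLs run in parallel.
let results =
    [ "a"; "a"; "b"; "a" ]
    |> List.map downloadUsingCache
    |> Async.Parallel
    |> Async.RunSynchronously
printfn "%A" results
```

Since downloadUsingCache returns an ordinary Async<string>, it composes with Async.Parallel in the way the question's imaginary usage asks for.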
Answer 2: (score: 2)
Here is a simplified version based on @Tomas Petricek's answer.
Let's assume we have a download function that, given a URL, returns an Async<string>. Here is a dummy version:
let asyncDownload url =
    async {
        let started = System.DateTime.UtcNow.Ticks
        do! Async.Sleep 30
        let finished = System.DateTime.UtcNow.Ticks
        let r = sprintf "Downloaded  %A it took: %dms %s" (started / 10000L) ((finished - started) / 10000L) url
        printfn "%s" r
        return r
    }
In a module of our own we have some simple, generic Mailbox helper functions:
module Mailbox =
    // Runs f on every message; exceptions go to hndl and the loop keeps going.
    let iterA hndl f =
        MailboxProcessor.Start(fun inbox ->
            async {
                while true do
                    try
                        let! msg = inbox.Receive()
                        do! f msg
                    with e -> hndl e
            }
        )
    // Like iterA, but each message carries a reply channel for the result of f.
    let callA hndl f = iterA hndl (fun ((replyChannel: AsyncReplyChannel<_>), msg) -> async {
        let! r = f msg
        replyChannel.Reply r
    })
    // Same as callA, for a synchronous f.
    let call hndl f = callA hndl (fun msg -> async { return f msg } )
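The purpose of this "library" is to simplify the most typical uses of MailboxProcessor. It may look complicated and hard to follow, but what matters is what the functions do and how to use them.
In particular, we will use Mailbox.call to create a mailbox agent that can reply with a value. Its signature is: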
val call:
hndl: exn -> unit ->
f : 'a -> 'b
-> MailboxProcessor<AsyncReplyChannel<'b> * 'a>
The first parameter is an exception handler and the second is a function that returns a value. Here is how we define downloadManager:
let downloadManager =
    let dict = new System.Collections.Generic.Dictionary<string, _>()
    Mailbox.call (printfn "%A") (fun url ->
        if dict.ContainsKey url then dict.[url] else
        // Not cached yet: start the download once and cache the handle to it.
        let result = asyncDownload url |> Async.StartChild |> Async.RunSynchronously
        dict.Add(url, result)
        result
    )
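Our cache is a Dictionary. If the URL is not in it, we call asyncDownload and start it as a child computation. By using Async.StartChild we don't have to wait until the download finishes; we just return an async that waits for it to finish.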
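To see why caching the handle from Async.StartChild shares one download across callers, here is a small sketch. The names runs, slowWork, sharedDemo and repeatedDemo are hypothetical and used only for illustration. Awaiting the handle twice runs the work once, while awaiting the original workflow twice runs it twice.

```fsharp
let runs = ref 0   // how many times the body of slowWork has started

let slowWork = async {
    lock runs (fun () -> runs.Value <- runs.Value + 1)
    do! Async.Sleep 50
    return 42 }

let sharedDemo = async {
    // StartChild starts the work now and returns a handle of type Async<int>.
    let! handle = Async.StartChild slowWork
    // Awaiting the handle more than once does not restart the work.
    let! a = handle
    let! b = handle
    return a, b }

let a, b = Async.RunSynchronously sharedDemo
let runsAfterShared = runs.Value
printfn "a=%d b=%d runs=%d" a b runsAfterShared   // a=42 b=42 runs=1

// For contrast: awaiting the workflow itself runs its body each time.
let repeatedDemo = async {
    let! x = slowWork
    let! y = slowWork
    return x + y }

let sum = Async.RunSynchronously repeatedDemo
printfn "sum=%d runs=%d" sum runs.Value           // sum=84 runs=3
```

This is why the dictionary stores the handle returned by StartChild rather than the original asyncDownload workflow.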
To call the manager we use downloadManager.PostAndReply:
let downloadUrl url = downloadManager.PostAndReply(fun reply -> reply, url)
Here is a test:
let s = System.DateTime.UtcNow.Ticks
printfn "started %A" (s / 10000L)
let res =
    List.init 50 (fun i -> i, downloadUrl (string <| i % 5) )
    |> List.groupBy (snd >> Async.RunSynchronously)
    |> List.map (fun (t, ts) -> sprintf "%s - %A" t (ts |> List.map fst ) )
let f = System.DateTime.UtcNow.Ticks
printfn "finish  %A" (f / 10000L)
printfn "elapsed %dms" ((f - s) / 10000L)
res |> printfn "Result: \n%A"
This produces:
started 63676682503885L
Downloaded 63676682503911L it took: 34ms 1
Downloaded 63676682503912L it took: 33ms 2
Downloaded 63676682503911L it took: 37ms 0
Downloaded 63676682503912L it took: 33ms 3
Downloaded 63676682503912L it took: 33ms 4
finish 63676682503994L
elapsed 109ms
Result:
["Downloaded 63676682503911L it took: 37ms 0 - [0; 5; 10; 15; 20; 25; 30; 35; 40; 45]";
"Downloaded 63676682503911L it took: 34ms 1 - [1; 6; 11; 16; 21; 26; 31; 36; 41; 46]";
"Downloaded 63676682503912L it took: 33ms 2 - [2; 7; 12; 17; 22; 27; 32; 37; 42; 47]";
"Downloaded 63676682503912L it took: 33ms 3 - [3; 8; 13; 18; 23; 28; 33; 38; 43; 48]";
"Downloaded 63676682503912L it took: 33ms 4 - [4; 9; 14; 19; 24; 29; 34; 39; 44; 49]"]