F#: how to window a sequence based on a predicate rather than a fixed length

Date: 2017-09-24 12:06:28

Tags: f#

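Given the following input sequence, I would like to generate the desired output. I know that Seq.windowed can be used to get almost the desired result if all the windows have a fixed length. In this case, however, they do not: I would like to start a new sequence whenever "a" is encountered. Is this possible with the standard collections library?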

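let inputSequence = 
      ["a"; "b"; "c";
       "a"; "b"; "c"; "d";
       "a"; "b"; 
       "a"; "d"; "f";
       "a"; "x"; "y"; "z"]

// desired output:
// [["a"; "b"; "c"]; ["a"; "b"; "c"; "d"]; ["a"; "b"]; ["a"; "d"; "f"];
//  ["a"; "x"; "y"; "z"]]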

5 answers:

Answer 0 (score: 4)

Here is an approach that uses mutable state but is very concise:

let mutable i = 0
[ for x in inputSequence do
    if x = "a" then i <- i + 1
    yield i, x ]
|> List.groupBy fst
|> List.map snd
|> List.map (List.map snd)
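
The snippet above builds lists eagerly. Since the question is phrased in terms of sequences, a similar mutable-state idea could be applied lazily to a seq. The following is only a sketch: chunkSeqWhen is a hypothetical name, not a library function, and it assumes that elements before the first match should form their own leading chunk.

```fsharp
// Hypothetical helper (not part of FSharp.Core): lazily splits a sequence,
// starting a new chunk at every element for which `isStart` holds.
let chunkSeqWhen (isStart: 'a -> bool) (source: seq<'a>) : seq<'a list> =
    seq {
        // Buffer for the chunk currently being collected.
        let chunk = ResizeArray<'a>()
        for x in source do
            if isStart x && chunk.Count > 0 then
                // Emit a copy of the finished chunk before starting the next one.
                yield List.ofSeq chunk
                chunk.Clear()
            chunk.Add x
        // Emit whatever is left once the source is exhausted.
        if chunk.Count > 0 then
            yield List.ofSeq chunk
    }

let lazyChunks =
    chunkSeqWhen (fun x -> x = "a") (seq ["a"; "b"; "a"; "c"; "d"])
    |> Seq.toList
// lazyChunks = [["a"; "b"]; ["a"; "c"; "d"]]
```

Because chunks are produced on demand, this sketch should also work on unbounded input as long as only a finite prefix of the result is consumed, for example via Seq.take.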

Answer 1 (score: 3)

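As mentioned in another answer, you can implement this fairly easily using recursion or a fold. To make the recursive version more reusable, you can define a function chunkAt that starts a new chunk whenever it encounters a particular value in the list: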

let chunkAt start list = 
  let rec loop chunk chunks list = 
    match list with
    | [] -> List.rev ((List.rev chunk)::chunks)
    | x::xs when x = start && List.isEmpty chunk -> loop [x] chunks xs
    | x::xs when x = start -> loop [x] ((List.rev chunk)::chunks) xs
    | x::xs -> loop (x::chunk) chunks xs
  loop [] [] list

You can then run it on your input sequence using:

chunkAt "a" inputSequence
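
The question title asks about a predicate rather than a fixed value. As a minimal sketch, the same recursive loop can take a predicate instead; chunkWhen is a hypothetical name used here for illustration. One assumed difference in behavior: it returns an empty list for empty input instead of a list containing one empty chunk.

```fsharp
// Hypothetical helper: starts a new chunk at every element for which `isStart` holds.
let chunkWhen (isStart: 'a -> bool) (list: 'a list) : 'a list list =
    let rec loop chunk chunks rest =
        match rest with
        | [] ->
            // Flush the last chunk, unless nothing was collected at all.
            let chunks = if List.isEmpty chunk then chunks else List.rev chunk :: chunks
            List.rev chunks
        | x :: xs when isStart x && not (List.isEmpty chunk) ->
            // A new chunk begins: close the current one first.
            loop [x] (List.rev chunk :: chunks) xs
        | x :: xs ->
            // Either an ordinary element, or a start element with no open chunk.
            loop (x :: chunk) chunks xs
    loop [] [] list

let chunks = chunkWhen (fun x -> x = "a") ["r"; "a"; "b"; "a"; "c"; "d"]
// chunks = [["r"]; ["a"; "b"]; ["a"; "c"; "d"]]
```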

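Although there is no standard library function that does this, you can use Deedle, a data series manipulation library that implements a fairly rich set of chunking functions. To do this with Deedle, you can turn your sequence into a series indexed by ordinal index and then use: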

let s = Series.ofValues inputSequence
let chunked = s |> Series.chunkWhile (fun _ k2 -> s.[k2] <> "a")

If you want to turn the data back into a list, you can use the Values property of the returned series:

chunked.Values |> Seq.map (fun s -> s.Values)

Answer 2 (score: 1)

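Unfortunately, despite its FP heritage, F# lacks some common list manipulation functions. Splitting or partitioning based on a predicate is one of them. You can probably implement it with recursion, and therefore with a fold. However, if you just want to apply library functions, here is one way: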

let inputSequence = 
      ["a"; "b"; "c";
       "a"; "b"; "c"; "d";
       "a"; "b"; 
       "a"; "d"; "f";
       "a"; "x"; "y"; "z"]

let aIdx = 
    inputSequence 
        |> List.mapi (fun i x -> i, x) //find the index of a elements
        |> List.filter (fun x -> snd x = "a")
        |> List.map fst //extract it into a list

[List.length inputSequence] 
    |> List.append aIdx //We will need the last "a" index, and the end of the list
    |> List.pairwise //begin and end index
    |> List.map (fun (x,y) -> inputSequence.[x .. (y - 1)]) 

//val it : string list list =
//  [["a"; "b"; "c"]; ["a"; "b"; "c"; "d"]; ["a"; "b"]; ["a"; "d"; "f"];
//   ["a"; "x"; "y"; "z"]]

Answer 3 (score: 0)

This answer uses basically the same mechanism as the one provided by @TheQuickBrownFox, but it does not use a mutable variable:

inputSequence 
|> List.scan (fun i x -> if x = "a" then i + 1 else i) 0 
|> List.tail
|> List.zip inputSequence 
|> List.groupBy snd
|> List.map (snd >> List.map fst)
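
To make the pipeline above easier to follow, here is a sketch that names each intermediate value on a shorter input. The names input, counts, tagged and grouped are chosen for illustration only.

```fsharp
let input = ["a"; "b"; "a"; "c"; "d"]

// Running count of "a" seen so far, including the initial state 0.
let counts = input |> List.scan (fun i x -> if x = "a" then i + 1 else i) 0
// counts = [0; 1; 1; 2; 2; 2]

// Dropping the initial state aligns each count with its element.
let tagged = List.zip input (List.tail counts)
// tagged = [("a", 1); ("b", 1); ("a", 2); ("c", 2); ("d", 2)]

// Grouping by the count and then discarding it leaves the chunks.
let grouped = tagged |> List.groupBy snd |> List.map (snd >> List.map fst)
// grouped = [["a"; "b"]; ["a"; "c"; "d"]]
```

Any elements that precede the first "a" keep the count 0, so with this approach they end up in a leading group of their own.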

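If you want to use a library, besides the one suggested by @Tomas, F#+ provides some basic split functions, which let you write the function like this: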

let splitEvery x = 
    List.split (seq [[x]]) >> Seq.map (List.cons x) >> Seq.tail >> Seq.toList

There is a proposal to include these kinds of functions in the F# core library; the discussion is worth reading.

Answer 4 (score: 0)

Here is a short one:

let folder (str: string) ((xs, xss): list<string> * list<list<string>>) =
    if str = "a" then ([], ((str :: xs) :: xss))
    else (str :: xs, xss)

List.foldBack folder inputSequence ([], [])
|> snd

// [["a"; "b"; "c"]; ["a"; "b"; "c"; "d"]; ["a"; "b"]; ["a"; "d"; "f"]; ["a"; "x"; "y"; "z"]]

This satisfies the specification in the question, 'I would like to start a new sequence whenever "a" is encountered', in the sense that any initial strings before the first "a" are ignored. For example, for

let inputSequence = 
      ["r"; "s";
       "a"; "b"; "c";
       "a"; "b"; "c"; "d";
       "a"; "b"; 
       "a"; "d"; "f";
       "a"; "x"; "y"; "z"]

the same result as above is obtained.

If the initial strings before the first "a" need to be captured, the following can be used:

match inputSequence |> List.tryFindIndex (fun x -> x = "a") with
| None -> [inputSequence]
| Some i -> (List.take i inputSequence) :: 
            (List.foldBack folder (List.skip i inputSequence) ([], []) |> snd)

// [["r"; "s"]; ["a"; "b"; "c"]; ["a"; "b"; "c"; "d"]; ["a"; "b"];
//  ["a"; "d"; "f"]; ["a"; "x"; "y"; "z"]]