I wrote a quick and dirty function to compare file contents (BTW, I have already checked that the files are the same size):
open System.IO

let eqFiles f1 f2 =
    let bytes1 = Seq.ofArray (File.ReadAllBytes f1)
    let bytes2 = Seq.ofArray (File.ReadAllBytes f2)
    let res = Seq.compareWith (fun x y -> (int x) - (int y)) bytes1 bytes2
    res = 0
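I'm not happy about reading the entire contents into arrays. I would rather have a lazy sequence of bytes, but I couldn't find a suitable API in F#.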
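Answer 0 (score: 9)
If you want to use the full power of F#, you can also do this asynchronously. The idea is to asynchronously read a block of a specified size from each of the two files and then compare the blocks (using the standard, simple comparison of byte arrays).
This is actually an interesting problem, because you need to generate something like an asynchronous sequence: a sequence of Async<T> values that are generated on demand, but without blocking threads the way a plain seq<T> or a loop would. The declaration of the asynchronous sequence and the function that reads the data could look like this:
EDIT: I also posted the snippet to http://fssnip.net/1k, which has nicer F# formatting :-)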
open System.IO

/// Represents a sequence of values 'T where items
/// are generated asynchronously on-demand
type AsyncSeq<'T> = Async<AsyncSeqInner<'T>>
and AsyncSeqInner<'T> =
  | Ended
  | Item of 'T * AsyncSeq<'T>

/// Read file 'fn' in blocks of size 'size'
/// (returns on-demand asynchronous sequence)
let readInBlocks fn size = async {
  let stream = File.OpenRead(fn)
  let buffer = Array.zeroCreate size

  /// Returns next block as 'Item' of async seq
  let rec nextBlock() = async {
    let! count = stream.AsyncRead(buffer, 0, size)
    // A count of zero means the end of the file was reached
    if count = 0 then return Ended
    else
      // Create buffer with the right size
      let res =
        if count = size then buffer
        else buffer |> Seq.take count |> Array.ofSeq
      return Item(res, nextBlock()) }

  return! nextBlock() }
The asynchronous workflow that does the comparison is then quite simple:
let rec compareBlocks seq1 seq2 = async {
  let! item1 = seq1
  let! item2 = seq2
  match item1, item2 with
  | Item(b1, ns1), Item(b2, ns2) when b1 <> b2 -> return false
  | Item(b1, ns1), Item(b2, ns2) -> return! compareBlocks ns1 ns2
  | Ended, Ended -> return true
  | _ -> return failwith "Size doesn't match" }

let s1 = readInBlocks "f1" 1000
let s2 = readInBlocks "f2" 1000
// The result is an Async<bool>; run it with e.g. Async.RunSynchronously
compareBlocks s1 s2
Answer 1 (score: 6)
This compares the files byte by byte and short-circuits as soon as a difference is found. It also handles files of different sizes.
open System.IO

let rec compareFiles (fs1: FileStream) (fs2: FileStream) =
    match fs1.ReadByte(),fs2.ReadByte() with
    | -1,-1 -> true //all bytes have been enumerated and were all equal
    | _,-1 -> false //the files are of different length
    | -1,_ -> false //the files are of different length
    | x,y when x <> y -> false
    //only continue to the next bytes when the present two are equal
    | _ -> compareFiles fs1 fs2
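A usage sketch for the function above, assuming a hypothetical wrapper named `filesEqual` (it is not part of the answer): it opens both files with `use` so that the handles are released once the comparison finishes. `compareFiles` is repeated so the snippet stands on its own.

```fsharp
open System.IO

let rec compareFiles (fs1: FileStream) (fs2: FileStream) =
    match fs1.ReadByte(), fs2.ReadByte() with
    | -1, -1 -> true            // both streams ended at the same point
    | _, -1 -> false            // second file is shorter
    | -1, _ -> false            // first file is shorter
    | x, y when x <> y -> false // first differing byte
    | _ -> compareFiles fs1 fs2

// Hypothetical wrapper: owns the streams and disposes them on exit
let filesEqual (path1: string) (path2: string) =
    use fs1 = File.OpenRead path1
    use fs2 = File.OpenRead path2
    compareFiles fs1 fs2
```

Because the recursive call is in tail position, large files should not grow the stack.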
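Answer 2 (score: 1)
You have to stream the files and walk through them block by block, but the File and Stream classes in .NET (and Stream's descendants and related types such as StreamReader) give you what you need.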
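A minimal sketch of that block-by-block approach, using only `Stream.Read`. The names `readFull`, `blocksEqual` and `streamsEqual` are made up for illustration. Since `Stream.Read` is allowed to return fewer bytes than requested, `readFull` keeps reading until the buffer is full or the stream ends, so the blocks of the two streams stay aligned.

```fsharp
open System.IO

/// Fill 'buffer' as far as possible; returns the number of bytes read
/// (less than buffer.Length only when the stream has ended).
let readFull (stream: Stream) (buffer: byte[]) =
    let mutable total = 0
    let mutable finished = false
    while not finished && total < buffer.Length do
        let n = stream.Read(buffer, total, buffer.Length - total)
        if n = 0 then finished <- true else total <- total + n
    total

/// True when the first 'count' bytes of both arrays are equal.
let blocksEqual (a: byte[]) (b: byte[]) (count: int) =
    let mutable i = 0
    while i < count && a.[i] = b.[i] do
        i <- i + 1
    i = count

/// Compare two streams block by block, stopping at the first difference.
let streamsEqual (blockSize: int) (s1: Stream) (s2: Stream) =
    let b1 = Array.zeroCreate<byte> blockSize
    let b2 = Array.zeroCreate<byte> blockSize
    let rec loop () =
        let n1 = readFull s1 b1
        let n2 = readFull s2 b2
        if n1 <> n2 then false                   // different lengths
        elif n1 = 0 then true                    // both ended, no difference found
        elif not (blocksEqual b1 b2 n1) then false
        else loop ()
    loop ()
```

To compare files, open each with `File.OpenRead` in a `use` binding and pass the two streams to `streamsEqual` together with a block size of your choice.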
Answer 3 (score: 1)
As others have already said, use a stream for lazy I/O, e.g.
open System

let seqOfFstream (fstream: IO.FileStream) = seq {
    let currentByte = ref 0
    while !currentByte >= 0 do
        currentByte := fstream.ReadByte()
        // The final -1 (end of stream) is yielded as well, so a
        // difference in length shows up as a differing element
        yield !currentByte
}

let fileEq fname1 fname2 =
    use f1 = IO.File.OpenRead fname1
    use f2 = IO.File.OpenRead fname2
    not (Seq.exists2 (fun a b -> a <> b) (seqOfFstream f1) (seqOfFstream f2))
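Answer 4 (score: 0)
You don't need anything new in F# for this. I would just define a sequence that yields the bytes using a FileStream underneath, instead of using File.ReadAllBytes. You can then compare two such sequences "the F# way".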
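One way this could look, as a sketch rather than a definitive implementation: `bytesOf` is a hypothetical helper name, and `eqFiles` mirrors the function from the question with the arrays swapped for lazy sequences.

```fsharp
open System.IO

/// Lazily yields the bytes of the file at 'path' (hypothetical helper).
/// The file is opened when enumeration starts and is closed when the
/// enumeration finishes or is abandoned early.
let bytesOf (path: string) : seq<byte> = seq {
    use stream = File.OpenRead path
    yield! Seq.unfold (fun () ->
        match stream.ReadByte() with
        | -1 -> None                 // end of stream
        | b -> Some (byte b, ())) () }

/// Same shape as the question's eqFiles, but nothing is read into arrays.
let eqFiles (f1: string) (f2: string) =
    Seq.compareWith (fun (x: byte) (y: byte) -> compare x y) (bytesOf f1) (bytesOf f2) = 0
```

`Seq.compareWith` stops at the first non-zero comparison and returns a non-zero result when one sequence is shorter than the other, so files of different lengths are reported as unequal without a separate size check.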
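Answer 5 (score: 0)
This is a tweak to the accepted answer by Tomas Petricek. You may ask where the streams are closed. They aren't. In my case this caused handle leaks and sharing violations. I fixed it by changing the signature of the readInBlocks function so that the responsibility for opening and closing the streams moves to the calling method. From: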
let readInBlocks fn size =
[...]
To:
let readInBlocks (stream:FileStream) size =
[...]
The compare-file method is then responsible for disposing of the streams:
let compareFile (filePath1, filePath2) =
  use stream1 = File.OpenRead(filePath1)
  use stream2 = File.OpenRead(filePath2)
  let s1 = readInBlocks stream1 1000
  let s2 = readInBlocks stream2 1000
  let isEqual =
    compareBlocks s1 s2
    |> Async.RunSynchronously
  isEqual
The full adjusted code:
open System.IO

/// Represents a sequence of values 'T where items
/// are generated asynchronously on-demand
type AsyncSeq<'T> = Async<AsyncSeqInner<'T>>
and AsyncSeqInner<'T> =
  | Ended
  | Item of 'T * AsyncSeq<'T>

/// Read 'stream' in blocks of size 'size'
/// (returns on-demand asynchronous sequence)
let readInBlocks (stream:FileStream) size =
  async {
    let buffer = Array.zeroCreate size

    /// Returns next block as 'Item' of async seq
    let rec nextBlock() =
      async {
        let! count = stream.AsyncRead(buffer, 0, size)
        if count = 0 then return Ended
        else
          // Create buffer with the right size
          let res =
            if count = size then buffer
            else buffer |> Seq.take count |> Array.ofSeq
          return Item(res, nextBlock())
      }

    return! nextBlock()
  }

/// Asynchronous function that compares two asynchronous sequences
/// item by item. If an item doesn't match, 'false' is returned
/// immediately without generating the rest of the sequence. If the
/// lengths don't match, exception is thrown.
let rec compareBlocks seq1 seq2 = async {
  let! item1 = seq1
  let! item2 = seq2
  match item1, item2 with
  | Item(b1, ns1), Item(b2, ns2) when b1 <> b2 -> return false
  | Item(b1, ns1), Item(b2, ns2) -> return! compareBlocks ns1 ns2
  | Ended, Ended -> return true
  | _ -> return failwith "Size doesn't match" }

/// Compare two files using blocks of 1000 bytes
let compareFile (filePath1, filePath2) =
  use stream1 = File.OpenRead(filePath1)
  use stream2 = File.OpenRead(filePath2)
  let s1 = readInBlocks stream1 1000
  let s2 = readInBlocks stream2 1000
  let isEqual =
    compareBlocks s1 s2
    |> Async.RunSynchronously
  isEqual