Question

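I'm reading a file from n servers, and I want each one to download 1/n of the file. I thought some quick integer math would work, but it doesn't always seem to: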

threads = n
thread_id = 0:n-1
filesize (in bytes) = x

starting position = thread_id*(filesize/threads)
bytes to read = (filesize/threads)

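Sometimes, for just the wrong numbers, say a 26-byte file split across 9 threads (I know that's ridiculous, but it's just an example), it doesn't work out in my favor. There must be a better way. Any ideas?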
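To make the shortfall concrete, here is a minimal Python sketch of the scheme above. The function name naive_ranges is made up for illustration. With 26 bytes and 9 threads, integer division gives 2 bytes per thread, so the threads together cover only 18 of the 26 bytes.

```python
def naive_ranges(filesize, threads):
    """Return (start, length) per thread using the scheme from the question."""
    chunk = filesize // threads  # integer division drops the remainder
    return [(thread_id * chunk, chunk) for thread_id in range(threads)]


ranges = naive_ranges(26, 9)
covered = sum(length for _, length in ranges)
print(ranges[-1])  # (16, 2): the last thread stops at byte 18
print(covered)     # 18, so bytes 18..25 are never read
```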

Answer 1

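It seems to me the only thing missing is that the last thread (thread n-1) must read to the end of the file to pick up the 'modulus' bytes: the bytes left over after dividing by threads. Basically:

bytes_to_read = (thread_id == n - 1) ? filesize / threads + filesize % threads : filesize / threads

Alternatively, you could spread this extra work over the first filesize % threads threads by adding 1 byte to bytes_to_read on each of them. Of course, you would then have to adjust the starting positions.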
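The second option, spreading the leftover bytes over the first few threads, could be sketched in Python roughly as follows. This is one possible reading of the idea, not a definitive implementation, and the name balanced_ranges is hypothetical. Each thread's start is shifted by the number of earlier threads that received an extra byte.

```python
def balanced_ranges(filesize, threads):
    """Give the first (filesize % threads) threads one extra byte each."""
    base, extra = divmod(filesize, threads)
    ranges = []
    for thread_id in range(threads):
        # Every earlier thread that took an extra byte pushes this start by 1.
        start = thread_id * base + min(thread_id, extra)
        length = base + (1 if thread_id < extra else 0)
        ranges.append((start, length))
    return ranges


for start, length in balanced_ranges(26, 9):
    print(start, length)
```

For the 26-byte, 9-thread example, the first 8 threads read 3 bytes each and the last reads 2, so no thread does much more work than another.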

Answer 2

You have to do something like this:

starting position = thread_id * floor(filesize / threads)
bytes to read = floor(filesize / threads) if thread_id != threads-1
bytes to read = filesize - (threads-1)*floor(filesize / threads) if thread_id = threads - 1
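As a rough Python sketch of the rule above (the function name last_takes_rest is made up for illustration), every thread but the last reads floor(filesize / threads) bytes, and the last thread reads whatever remains:

```python
def last_takes_rest(filesize, threads):
    """Return (start, length) per thread; the last thread reads to end of file."""
    chunk = filesize // threads  # floor(filesize / threads)
    ranges = []
    for thread_id in range(threads):
        start = thread_id * chunk
        if thread_id != threads - 1:
            length = chunk
        else:
            # Everything the other threads did not cover.
            length = filesize - (threads - 1) * chunk
        ranges.append((start, length))
    return ranges


print(last_takes_rest(26, 9)[-1])  # (16, 10)
```

Every byte is read exactly once, but the load can be uneven: in the 26-byte, 9-thread example the last thread reads 10 bytes while the others read 2 each.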

Answer 3

To read every byte exactly once, compute the start and end positions consistently, then subtract to get the number of bytes:

start_position = thread_id * file_size / n
end_position = (thread_id + 1) * file_size / n
bytes_to_read = end_position - start_position

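Note that the position expressions are chosen carefully so that end_position == file_size when thread_id == n-1. If you do something else, such as thread_id * (file_size/n), you will need to treat the last thread as a special case, as @wuputah says.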
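A small Python sketch of this approach, with the hypothetical name boundary_ranges, shows that the ranges tile the file with no gaps or overlaps. The multiplication has to happen before the division. In a language with fixed-width integers, thread_id * file_size could overflow for very large files; Python integers do not have that limit.

```python
def boundary_ranges(file_size, n):
    """Return (start, length) per thread from consistently computed boundaries."""
    ranges = []
    for thread_id in range(n):
        start_position = thread_id * file_size // n      # multiply, then divide
        end_position = (thread_id + 1) * file_size // n  # next thread's start
        ranges.append((start_position, end_position - start_position))
    return ranges


print(boundary_ranges(26, 9))
```

Because each thread's end position is computed with the same expression as the next thread's start position, the pieces line up automatically, and the final end position is n * file_size // n, which equals file_size.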

Splitting a file with integer math

3 answers: