Question

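I need to download files chunk by chunk in multiple threads. For example, I have 1k files, each ~100Mb-1Gb, and I can only download them in chunks of 4096Kb (each http get request gives me only 4kb).

Downloading them in a single thread could take a very long time, so I want to download them in, say, 20 threads (one thread per file), and within each of those threads I also need to download several chunks at the same time.

Is there an example that shows this logic?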

Answer 1

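Here is an example of how to set up a concurrent downloader. The things to watch out for are bandwidth, memory, and disk space. You can exhaust your bandwidth by trying to do too much at once, and the same goes for memory. You are downloading fairly large files, so memory may be a problem. Another thing to note is that by using goroutines you lose the request order. So if the order of the returned bytes matters, this will not work as is, because you have to know the byte order to assemble the file at the end. That means downloading one at a time is best unless you implement a way to keep track of the order (perhaps some kind of global map[order int][]byte with a mutex to prevent race conditions). An alternative that does not involve Go (assuming you have a unix machine handy) is to use Curl; see http://osxdaily.com/2014/02/13/download-with-curl/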

package main

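import (
    "bytes"
    "fmt"
    "io"
    "log"
    "net/http"
    "os"
    "sync"
    "time"
)

// Be careful: you can run out of memory by downloading too many files at once.
// This example is meant as a starting point to modify.
func downloader(wg *sync.WaitGroup, sema chan struct{}, fileNum int, URL string) {
    // acquire a semaphore slot; release it when this download finishes
    sema <- struct{}{}
    defer func() {
        <-sema
        wg.Done()
    }()

    // Timeout is a time.Duration and covers the whole request, including
    // reading the body, so keep it generous for large files.
    client := &http.Client{Timeout: 10 * time.Minute}
    res, err := client.Get(URL)
    if err != nil {
        // log and return so one failed download does not stop the others
        log.Printf("download %s: %v", URL, err)
        return
    }
    defer res.Body.Close()
    var buf bytes.Buffer
    // I'm copying to a buffer before writing it to file
    // I could also just use io.Copy to write it to the file
    // directly and save memory by dumping to the disk directly.
    if _, err := io.Copy(&buf, res.Body); err != nil {
        log.Printf("read %s: %v", URL, err)
        return
    }
    // write the bytes to file
    if err := os.WriteFile(fmt.Sprintf("file%d.txt", fileNum), buf.Bytes(), 0644); err != nil {
        log.Printf("write file%d.txt: %v", fileNum, err)
    }
}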

func main() {
    links := []string{
        "url1",
        "url2", // etc...
    }
    var wg sync.WaitGroup
    // limit to four downloads at a time, this is called a semaphore
    limiter := make(chan struct{}, 4)
    for i, link := range links {
        wg.Add(1)
        go downloader(&wg, limiter, i, link)
    }
    wg.Wait()

}
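
The example above fetches each file with a single request. For the chunk-by-chunk part of the question, one possible approach is sketched below. It is only a sketch under assumptions that are not in the original answer: the server is assumed to honor the HTTP Range header and reply with 206 Partial Content, the file size is assumed to be known in advance (for instance from a Content-Length header), and the names chunkRanges, fetchChunk, and downloadChunked are hypothetical. Each chunk is written at its own offset with WriteAt, so the order in which goroutines finish does not matter and no ordered map is needed.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"sync"
)

// chunkRanges splits size bytes into inclusive [start, end] byte ranges
// of at most chunkSize bytes each. (Hypothetical helper.)
func chunkRanges(size, chunkSize int64) [][2]int64 {
	var ranges [][2]int64
	for start := int64(0); start < size; start += chunkSize {
		end := start + chunkSize - 1
		if end >= size {
			end = size - 1
		}
		ranges = append(ranges, [2]int64{start, end})
	}
	return ranges
}

// fetchChunk requests bytes start..end of url and writes them into out
// at offset start. Assumes the server supports Range requests.
func fetchChunk(client *http.Client, url string, out *os.File, start, end int64) error {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))
	res, err := client.Do(req)
	if err != nil {
		return err
	}
	defer res.Body.Close()
	if res.StatusCode != http.StatusPartialContent {
		return fmt.Errorf("range %d-%d: unexpected status %s", start, end, res.Status)
	}
	data, err := io.ReadAll(res.Body) // one chunk in memory at a time per worker
	if err != nil {
		return err
	}
	_, err = out.WriteAt(data, start)
	return err
}

// downloadChunked downloads url into path using at most workers
// concurrent range requests and returns the first error seen, if any.
func downloadChunked(client *http.Client, url, path string, size, chunkSize int64, workers int) error {
	out, err := os.Create(path)
	if err != nil {
		return err
	}
	defer out.Close()

	sema := make(chan struct{}, workers) // limits in-flight chunk requests
	var wg sync.WaitGroup
	var mu sync.Mutex
	var firstErr error
	for _, r := range chunkRanges(size, chunkSize) {
		wg.Add(1)
		sema <- struct{}{}
		go func(start, end int64) {
			defer wg.Done()
			defer func() { <-sema }()
			if err := fetchChunk(client, url, out, start, end); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}(r[0], r[1])
	}
	wg.Wait()
	return firstErr
}
```

A call such as downloadChunked(client, link, "file0.bin", size, 4096*1024, 4) could replace the single client.Get inside downloader, so the outer semaphore still limits how many files are in progress while the inner one limits chunks per file. Memory use is then roughly files in progress times chunks in flight times chunk size.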

Downloading files by chunks in multiple threads in Golang
