Question

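I've spent hours trying to solve this, and it seems Rebol just can't do it. This is a program that downloads all the images from a web page. It was nice to see that I could write it in far fewer lines of code, but the performance is terrible. Rebol times out after downloading 4-5 files. Adding wait 5 at the end of the loop reduces the timeouts, but that takes far too long!

An identical program written in C downloads everything in an instant. Here is the part of the Rebol code that downloads the images: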

http://pastebin.com/fTnq8A3m

Answer 1

Your script at http://pastebin.com/fTnq8A3m has a lot of errors.

For example, you have

write ... read/binary ...

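so you are reading the image as binary and then writing it out as text. You are also handling URLs as text when the URL already exists as a url! datatype.

So in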

read/binary join http://www.rebol.com/ %image.jpg

the join keeps the datatype! intact. There is no need to do this:

read/binary to-url join "http://www.rebol.com/" %image.jpg
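Putting both points together, here is a minimal sketch of the corrected pattern. It assumes the example image URL above actually exists, and the local file name %image.jpg is only an illustration.

```rebol
rebol []

; join keeps the datatype of its first argument, so the result is already a url!
img-url: join http://www.rebol.com/ %image.jpg
print type? img-url

; read the image as binary and write it out as binary, so the bytes are not altered
write/binary %image.jpg read/binary img-url
```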

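What size are these images?

Adding wait 5 won't affect the download, because you are attempting a blocking, synchronous download. And since you are using a button, you are inside VID, which means you are using a wait inside a wait.

Another way to do this is to set up an async handler and then start the download, so that you don't block the GUI as you do now.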

Answer 2

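Having used REBOL for commercial applications for years, most of which needed networking, I can assure you that REBOL's networking is very stable. In fact, it can run servers with months of uptime without any memory leaks.

But since you have a very specific goal, I thought I'd make a small app that shows you how it can be done, and that works.

This definitely works in R2. One problem you may be running into is network port timeouts, but that would only be the case if the servers and/or images you download take several seconds each and exceed the 30-second default timeout.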
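If the default timeout does turn out to be the cause, one thing to try is raising it before starting the downloads. This is only a sketch: it assumes R2 keeps the default in system/schemes/default/timeout and that HTTP ports pick up a per-scheme value from system/schemes/http/timeout, and the 2-minute value is arbitrary.

```rebol
; raise the network timeout from the 30-second default to 2 minutes (arbitrary value)
system/schemes/default/timeout: 0:02:00

; assumption: the http scheme reads its own timeout field when it is set
system/schemes/http/timeout: 0:02:00
```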

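The app below takes a single URL as its parameter (you can set it to anything near the top) and downloads all the <IMG> URLs it finds on that page. It supports http and https. I tested it with sites such as Wikipedia, Bing, and Google image search, and it works pretty well... download rates are fairly constant on each server. I added a speed report to the minimal GUI to give you an idea of the download rates.

Note that this is a synchronous application which simply downloads a list of images... you cannot simply add a GUI and expect it to run concurrently, since that requires a completely different network model (async HTTP ports), which in turn requires more complex networking code.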

rebol [
    title: "webpage images downloader example"
    notes: "works with R2 only"
]

; the last page-url is the one to be used... feel free to change this
page-url: http://en.wikipedia.org/wiki/Dog
page-url: https://www.google.com/search?q=dogs&tbm=isch
page-url: http://www.bing.com/images/search?q=dogs&go=&qs=ds

;------
; automatically setup URL-based information
page-dir: copy/part page-url find/last/tail page-url "/"
page-host: copy/part page-url find/tail at page-url 8 "/"

?? page-url
?? page-dir
?? page-host

output-dir: %downloaded-images/  ; save images in a subdir of current-directory
unless exists? output-dir [make-dir output-dir ]

images: []

;------
; read url (expecting an HTML document)
;
; Parse is used to collect and cleanup URLs, make them absolute URLs. 
parse/all read page-url [
    some [
        thru {<img } thru {src="} copy image to {"} (
            case [
                "https://" = copy/part image 8 [image: to-url image]
                "http://" = copy/part image 7 [image: to-url image]
                "//" = copy/part image 2 [image: join  http:// at image 3  ]
                #"/" = pick image 1 [image: join page-host image ]
                'default [image: join page-dir image]
            ]
            append images image
         )
    ]
]

;------
; pretty-print image list
new-line/all images yes
probe images

;------
; display report window
view/new layout [ field-info: text 500 para [wrap?: false]   speed-info: text 500    ]

;------
; download images and report all activity
i: bytes: 0
s: now/precise
foreach image images [
    unless attempt [
        i: i + 1 
        probe image
        legal-chars: charset [#"a" - #"z" #"A" - #"Z" "0123456789-_.="] 
        fname: to-string find/last/tail image "/" ; get filename from url

        parse/all fname [some [ legal-chars | letter: skip  (change letter "-") ] ] ; convert illegal filename chars

        fname: join output-dir to-file fname ; use url filename to build disk path
        write/binary fname read/binary image ; download file

        ; update GUI
        t: difference now/precise s

        field-info/text: rejoin ["Downloading: (" i "/" length? images ") "  fname]
        show field-info

        bytes: bytes + size? fname
        speed-info/text: rejoin ["bytes: "  bytes ",   time: "  t   ",   speed : " (bytes / 1000) / ( to-decimal t) "kb/s"]
        show speed-info

        true ; all is good, attempt should return a value
    ][
        print "^/^/---^/unable to download image:"
        print image
        print "---^/^/"
    ]
]

If you don't need the web page scanner and have a manual list of images to grab, just replace that code with a block of images like so:

images: [ 
    http://server.com/img1.png
    http://server.com/img2.png
    http://server.com/img3.png
]

and let the download loop do its work.

Hope this helps.

Answer 3

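Are the long waits really necessary? In long loops Rebol needs to wait now and then to process GUI events, but IIRC wait 0 should do the trick. Is it possible that the event queuing is causing the problem?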

Rebol times out when downloading in rapid succession
