Question

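I need to take screenshots of thousands of URLs across many hosts.

I can use the lib from the command line, but how do I integrate it into my code so that I can take multiple screenshots concurrently?

I think it has something to do with xvfb, as in the answer to this question: How to kill headless X server started via Python? But I'm not sure exactly what.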

Answer 1

Perhaps something like this (untested):

import sys

from webkit2png import WebkitRenderer, init_qtgui
from PyQt4.QtCore import QTimer

def renderer_func():
    # Configure the renderer before taking the screenshot
    renderer = WebkitRenderer()
    renderer.width = 800
    renderer.height = 600
    renderer.timeout = 10
    renderer.wait = 1
    renderer.format = "png"
    renderer.grabWholeWindow = False

    # PNG data is binary, so open the output file in binary mode
    outfile = open("stackoverflow.png", "wb")
    renderer.render_to_file(url="http://stackoverflow.com", file=outfile)
    outfile.close()

app = init_qtgui()
# Run the renderer once the Qt event loop has started
QTimer.singleShot(0, renderer_func)
sys.exit(app.exec_())

This is shamelessly ripped from the source code of webkit2png.py.

Answer 2

I use subprocess to call webkit2png (installed via python-webkit2png), and it works well.

import os
import subprocess

def scrape_url(url, outpath):
    """
    Requires webkit2png to be on the path
    """
    # -o: output file, -g: window geometry (width height), -t: timeout in seconds
    subprocess.call(["webkit2png", "-o", outpath, "-g", "1000", "1260",
                     "-t", "30", url])

def scrape_list_urls(list_url_out_name, outdir):
    """
    list_url_out_name is a list of tuples: (url, name)
    where name.png will be the image's name
    """
    for count, (url, name) in enumerate(list_url_out_name):
        print(count)  # progress indicator
        outpath = os.path.join(outdir, name + '.png')
        scrape_url(url, outpath)
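
The loop above takes the screenshots one after another. Since the question asks for concurrent screenshots, here is a minimal sketch of running the same webkit2png command from a thread pool. It assumes Python 3 for the driver script (for concurrent.futures) and that webkit2png is on the path; the names build_command and scrape_concurrently, the run parameter, and the default of 4 workers are illustrative and not part of python-webkit2png. On a headless machine each process still needs an X display (for example via xvfb), which is not shown here.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(url, outpath):
    """Build the same webkit2png command line used in the answer above."""
    return ["webkit2png", "-o", outpath, "-g", "1000", "1260", "-t", "30", url]


def scrape_concurrently(url_name_pairs, outdir, max_workers=4, run=subprocess.call):
    """Render (url, name) pairs in parallel and return {name: exit_code}.

    `run` is a hypothetical hook: it defaults to subprocess.call and can be
    swapped for a stub when no webkit2png binary is available.
    """
    def job(pair):
        url, name = pair
        outpath = os.path.join(outdir, name + ".png")
        # Each call blocks one worker thread while its child process runs,
        # so up to max_workers webkit2png processes run at the same time.
        return name, run(build_command(url, outpath))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(job, url_name_pairs))
```

Threads are enough here because the rendering happens in the child processes; the Python side only waits. A non-zero exit code in the returned dictionary marks a URL worth retrying.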

Answer 3

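Here I use one argument to pass the location of a .txt file containing a newline-delimited list of sites, and a second argument for the output location of the PNG files.

https://gist.github.com/deadstar1/e8d30102afbaefec531d6708f761e104 Thanks to @paljenczy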
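
The gist is not reproduced here. As a rough sketch of the two-argument idea only (not the gist's actual code), the following reads a newline-delimited file of URLs and pairs each URL with an output PNG path. The function names read_urls, name_for, and main, and the file-naming rule, are assumptions made for illustration.

```python
import argparse
import os
import re


def read_urls(list_path):
    """Return the non-empty, stripped lines of a newline-delimited file."""
    with open(list_path) as handle:
        return [line.strip() for line in handle if line.strip()]


def name_for(url):
    """Hypothetical helper: turn a URL into a file-system-safe name stem."""
    stem = re.sub(r"^[a-zA-Z]+://", "", url)  # drop the scheme
    return re.sub(r"[^A-Za-z0-9._-]+", "_", stem).strip("_")


def main(argv=None):
    """Parse the two arguments and return a list of (url, outpath) pairs."""
    parser = argparse.ArgumentParser(description="Screenshot a list of sites.")
    parser.add_argument("list_path", help="text file with one URL per line")
    parser.add_argument("outdir", help="directory for the PNG files")
    args = parser.parse_args(argv)
    return [(url, os.path.join(args.outdir, name_for(url) + ".png"))
            for url in read_urls(args.list_path)]
```

Each resulting pair can then be handed to a function like scrape_url from Answer 2, for example by calling main(["sites.txt", "shots"]) and looping over the result.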

How to take multiple screenshots concurrently with python-webkit2png?
