Create an image (PNG or JPEG) from a PDF, plus an HTML image map of the text in the image?

Time: 2019-01-22 17:24:36

Tags: html pdf imagemap

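I am documenting a system that I maintain. The documentation includes a diagram that I created in TeX / TikZ and rendered to a PDF file. I then convert the PDF to an image file (PNG, via ImageMagick) and include it in an HTML document. This works well.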

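Now I would like to create an image map for the image so that I can add hyperlinks, mouseover behavior, and so on. The image will be updated periodically as the system changes, so I would like to automate the process if possible.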

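Is there a software library or tool that can automatically create an image map of the various text items in a PDF file when that PDF is rendered to PNG?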

Here is an example, taken from a gist I created:

[Image: the example block diagram]

In this case, I would like to turn some of the text strings into hyperlinks by finding their bounding boxes in the PDF:

  • controller
  • actuator
  • sensor
  • A
  • B
  • C
  • D
  • u
  • y
  • F(s)
  • G(s)
  • H(s)

(They are all text content in the PDF file; in Acrobat Reader I can select the text of any of them and copy and paste it into my text editor.)

Is there a way to do this?

2 Answers:

Answer 0 (score: 3)

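I was able to put together the following Python solution as a starting point. It converts the PDF to a PNG and prints the corresponding image map markup.

It takes the output dpi as an optional argument (default 200) so that the bounding boxes are scaled correctly from the PDF default of 72 dpi to the PNG: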

from pdf2image import convert_from_path
from pdfminer.converter import PDFPageAggregator
from pdfminer.layout import LAParams, LTTextBox
from pdfminer.pdfinterp import PDFPageInterpreter
from pdfminer.pdfinterp import PDFResourceManager
from pdfminer.pdfpage import PDFPage

from yattag import Doc, indent

import argparse
import os


def transform_coords(lobj, mb):

    # Transform LTTextBox bounding box to image map area bounding box.
    #
    # The bounding box of each LTTextBox is specified as:
    #
    # x0: the distance from the left of the page to the left edge of the box
    # y0: the distance from the bottom of the page to the lower edge of the box
    # x1: the distance from the left of the page to the right edge of the box
    # y1: the distance from the bottom of the page to the upper edge of the box
    #
    # So the y coordinates start from the bottom of the image. But with image map
    # areas, y coordinates start from the top of the image, so here we subtract
    # the bounding box's y-axis values from the total height.

    return [lobj.x0, mb[3] - lobj.y1, lobj.x1, mb[3] - lobj.y0]


def get_imagemap(d):
    doc, tag, text = Doc().tagtext()
    with tag("map", name="map"):
        for k, v in d.items():
            # href="#" is a placeholder link, as in the example result
            doc.stag("area", shape="rect", coords=",".join(v), href="#", alt=k)
    return indent(doc.getvalue())


def get_bboxes(pdf, dpi):
    rsrcmgr = PDFResourceManager()
    device = PDFPageAggregator(rsrcmgr, laparams=LAParams())
    interpreter = PDFPageInterpreter(rsrcmgr, device)

    # Keep the file open while the first page is read and laid out,
    # and close it afterwards.
    with open(pdf, "rb") as fp:
        page = next(PDFPage.get_pages(fp))
        interpreter.process_page(page)
        layout = device.get_result()

    # PDFminer reports bounding boxes based on a dpi of 72. I could not find a way
    # to change this, so instead I scale each coordinate by multiplying by dpi/72
    scale = dpi / 72.0

    return {
        lobj.get_text().strip(): [
            str(int(x * scale)) for x in transform_coords(lobj, page.mediabox)
        ]
        for lobj in layout
        if isinstance(lobj, LTTextBox)
    }


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("pdf")
    parser.add_argument("--dpi", type=int, default=200)

    args = parser.parse_args()

    page = list(convert_from_path(args.pdf, args.dpi))[0]
    page.save(f"{os.path.splitext(args.pdf)[0]}.png", "PNG")

    print(get_imagemap(get_bboxes(args.pdf, args.dpi)))


if __name__ == "__main__":
    main()
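
The coordinate handling in the script can be looked at in isolation. The following is a minimal sketch, not part of the original script: the function name pdf_bbox_to_area_coords and the sample numbers are made up for illustration. It assumes the page's lower-left corner is at the origin, flips the y axis and scales from 72 units per inch to the output dpi, in the same order as the script does.

```python
def pdf_bbox_to_area_coords(x0, y0, x1, y1, page_height, dpi=200):
    """Convert a PDF bounding box (points, origin at the bottom-left)
    into image-map rect coordinates (pixels, origin at the top-left)."""
    scale = dpi / 72.0  # PDF coordinates use 72 units per inch

    left = int(x0 * scale)
    top = int((page_height - y1) * scale)     # upper edge, measured from the top
    right = int(x1 * scale)
    bottom = int((page_height - y0) * scale)  # lower edge, measured from the top
    return left, top, right, bottom


# Hypothetical box on a hypothetical page that is 100 pt tall, rendered
# at 144 dpi so that the scale factor is exactly 2.
coords = pdf_bbox_to_area_coords(36.0, 72.0, 72.0, 90.0, page_height=100.0, dpi=144)
print(",".join(map(str, coords)))  # prints 72,20,144,56
```

Note that the upper edge of the PDF box (y1) becomes the top coordinate of the area and the lower edge (y0) becomes the bottom one, which is why the two y values swap places.
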

Example result from the script for the block diagram:

<img src="https://i.stack.imgur.com/aXWMc.png" usemap="#map">
<map name="map">
  <area shape="rect" coords="361,8,380,43" href="#" alt="B" />
  <area shape="rect" coords="434,31,500,64" href="#" alt="G(s)" />
  <area shape="rect" coords="432,93,502,117" href="#" alt="actuator" />
  <area shape="rect" coords="552,8,572,42" href="#" alt="C" />
  <area shape="rect" coords="596,58,609,86" href="#" alt="y" />
  <area shape="rect" coords="105,26,119,40" href="#" alt="+" />
  <area shape="rect" coords="107,54,122,78" href="#" alt="−" />
  <area shape="rect" coords="35,58,51,86" href="#" alt="u" />
  <area shape="rect" coords="164,8,182,43" href="#" alt="A" />
  <area shape="rect" coords="163,152,183,187" href="#" alt="D" />
  <area shape="rect" coords="241,31,311,64" href="#" alt="H(s)" />
  <area shape="rect" coords="236,94,316,118" href="#" alt="controller" />
  <area shape="rect" coords="243,175,309,208" href="#" alt="F (s)" />
  <area shape="rect" coords="247,234,305,258" href="#" alt="sensor" />
</map>
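
Since the question asks for only some of the strings to become hyperlinks, the areas could be filtered and given real targets before the markup is written. The following is a rough sketch using only the standard library. The function build_image_map and the URLs are hypothetical; the coordinates are copied from the example result above.

```python
from html import escape


def build_image_map(areas, links, name="map"):
    """Return <map> markup containing only the labels that appear in `links`.

    areas: dict mapping label -> [left, top, right, bottom] in pixels
    links: dict mapping label -> URL (labels without a URL are skipped)
    """
    lines = [f'<map name="{escape(name)}">']
    for label, coords in areas.items():
        href = links.get(label)
        if href is None:
            continue  # e.g. "+" should not become clickable
        coord_str = ",".join(str(c) for c in coords)
        lines.append(
            f'  <area shape="rect" coords="{coord_str}" '
            f'href="{escape(href)}" alt="{escape(label)}" />'
        )
    lines.append("</map>")
    return "\n".join(lines)


# Coordinates taken from the example result; the URLs are made up.
areas = {
    "controller": [236, 94, 316, 118],
    "+": [105, 26, 119, 40],
    "sensor": [247, 234, 305, 258],
}
links = {"controller": "controller.html", "sensor": "sensor.html"}
print(build_image_map(areas, links))
```

Keeping the link table separate from the extracted boxes means the boxes can be regenerated whenever the diagram changes without touching the link targets.
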

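Answer 1 (score: 0)

Hmm. I found the Apache PDFBox library, which includes an example called PrintTextLocations.java. It does print the information, but I am not sure how to interpret it, and it reports one position per glyph.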

> java -jar print_text_locations.jar blockdiagram_example.pdf
String[37.864998,13.939003 fs=4.9813 xscale=4.9813 height=2.49065 space=2.4906502 width=5.1197815]+
String[59.185997,13.662003 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=6.6450577]A
String[130.229,13.662003 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=6.64505]B
String[198.783,13.498001 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=7.192993]C
String[86.827,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=9.699257]H
String[97.449005,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](
String[102.00201,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.5137405]s
String[107.51601,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536])
String[156.35,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=9.234192]G
String[165.58301,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](
String[170.136,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.513733]s
String[175.65,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536])
String[12.797,29.332 fs=9.9626 xscale=9.9626 height=4.9813 space=4.9813004 width=5.7035875]u
String[38.711,27.432999 fs=4.9813 xscale=4.9813 height=3.4022279 space=2.4906502 width=5.39624]?
String[214.641,29.332 fs=9.9626 xscale=9.9626 height=4.9813 space=4.9813004 width=4.884659]y
String[85.109,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4869003]c
String[88.5959,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]o
String[92.473335,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]n
String[96.35077,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387131]t
String[98.28948,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.3222733]r
String[100.611755,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]o
String[104.48919,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.5481873]l
String[106.03738,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.5481873]l
String[107.58556,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]e
String[111.463,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.3222733]r
String[155.67801,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]a
String[159.55544,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4868927]c
String[163.04233,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387207]t
String[164.98105,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]u
String[168.85847,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]a
String[172.7359,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387207]t
String[174.67462,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]o
String[178.55205,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.322281]r
String[58.912003,65.483 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=7.192993]D
String[87.536,73.099 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=7.577202]F
String[96.740005,73.099 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](
String[101.29201,73.099 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.5137405]s
String[106.80601,73.099 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.5525436])
String[88.983,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4869003]s
String[92.4699,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]e
String[96.347336,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]n
String[100.22477,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4869003]s
String[103.71167,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]o
String[107.5891,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.3222733]r

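However, I made some small changes, and it appears that the writeString method is called once for each text item, so I think I can find the overall bounding rectangle of each string: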

/**
 * Override the default functionality of PDFTextStripper.
 */
@Override
protected void writeString(String string, List<TextPosition> textPositions) throws IOException
{
    System.out.println("text string: "+string);
    for (TextPosition text : textPositions)
    {
        System.out.println( "String[" + text.getXDirAdj() + "," +
                text.getYDirAdj() + " fs=" + text.getFontSize() + " xscale=" +
                text.getXScale() + " height=" + text.getHeightDir() + " space=" +
                text.getWidthOfSpace() + " width=" +
                text.getWidthDirAdj() + "]" + text.getUnicode() );
    }
}

Output for the PDF file from the GitHub gist:

> java -jar pdf2imagemap.jar blockdiagram_example.pdf
text string: +
String[37.864998,13.939003 fs=4.9813 xscale=4.9813 height=2.49065 space=2.4906502 width=5.1197815]+
text string: A
String[59.185997,13.662003 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=6.6450577]A
text string: B
String[130.229,13.662003 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=6.64505]B
text string: C
String[198.783,13.498001 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=7.192993]C
text string: H(s)
String[86.827,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=9.699257]H
String[97.449005,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](
String[102.00201,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.5137405]s
String[107.51601,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536])
text string: G(s)
String[156.35,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=9.234192]G
String[165.58301,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](
String[170.136,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.513733]s
String[175.65,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536])
text string: u
String[12.797,29.332 fs=9.9626 xscale=9.9626 height=4.9813 space=4.9813004 width=5.7035875]u
text string: ?
String[38.711,27.432999 fs=4.9813 xscale=4.9813 height=3.4022279 space=2.4906502 width=5.39624]?
text string: y
String[214.641,29.332 fs=9.9626 xscale=9.9626 height=4.9813 space=4.9813004 width=4.884659]y
text string: controller
String[85.109,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4869003]c
String[88.5959,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]o
String[92.473335,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]n
String[96.35077,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387131]t
String[98.28948,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.3222733]r
String[100.611755,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]o
String[104.48919,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.5481873]l
String[106.03738,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.5481873]l
String[107.58556,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]e
String[111.463,41.419 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.3222733]r
text string: actuator
String[155.67801,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]a
String[159.55544,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4868927]c
String[163.04233,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387207]t
String[164.98105,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]u
String[168.85847,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]a
String[172.7359,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=1.9387207]t
String[174.67462,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774261]o
String[178.55205,41.046 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.322281]r
text string: D
String[58.912003,65.483 fs=9.9626 xscale=9.9626 height=6.1668496 space=2.769603 width=7.192993]D
text string: F
String[87.536,73.099 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=7.577202]F
text string: (s)
String[96.740005,73.099 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](
String[101.29201,73.099 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.5137405]s
String[106.80601,73.099 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.5525436])
text string: sensor
String[88.983,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4869003]s
String[92.4699,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]e
String[96.347336,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]n
String[100.22477,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.4869003]s
String[103.71167,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=3.8774338]o
String[107.5891,91.978004 fs=6.9738 xscale=6.9738 height=4.3167825 space=1.9387167 width=2.3222733]r
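
The overall bounding rectangle of each string could then be computed by taking the union of its glyph boxes. The following Python sketch post-processes the printed output above rather than changing the Java code. The function names are made up for illustration. It assumes that the first two numbers are the x position and the baseline y measured from the top of the page, so each glyph is taken to span from y - height to y. That interpretation is an assumption and should be checked against the rendered image.

```python
import re

# Matches lines such as:
# String[86.827,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=9.699257]H
GLYPH_RE = re.compile(
    r"String\[(?P<x>[\d.]+),(?P<y>[\d.]+) fs=\S+ xscale=\S+ "
    r"height=(?P<h>[\d.]+) space=\S+ width=(?P<w>[\d.]+)\]"
)
HEADER = "text string: "


def union_box(glyphs):
    """Union of glyph boxes; each glyph is an (x, y, width, height) tuple."""
    x0 = min(x for x, y, w, h in glyphs)
    x1 = max(x + w for x, y, w, h in glyphs)
    y0 = min(y - h for x, y, w, h in glyphs)  # assumed top of the tallest glyph
    y1 = max(y for x, y, w, h in glyphs)      # assumed baseline
    return x0, y0, x1, y1


def string_bounding_boxes(lines):
    """Return a list of (text, (x0, y0, x1, y1)) in the order encountered."""
    results = []
    text, glyphs = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(HEADER):
            if text is not None and glyphs:
                results.append((text, union_box(glyphs)))
            text, glyphs = line[len(HEADER):], []
            continue
        m = GLYPH_RE.match(line)
        if m:  # other lines, such as the command line, are ignored
            glyphs.append(tuple(float(m.group(k)) for k in ("x", "y", "w", "h")))
    if text is not None and glyphs:
        results.append((text, union_box(glyphs)))
    return results


sample = [
    "text string: H(s)",
    "String[86.827,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=9.699257]H",
    "String[97.449005,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536](",
    "String[102.00201,21.278 fs=11.9552 xscale=11.9552 height=5.9776 space=5.9776006 width=5.5137405]s",
    "String[107.51601,21.278 fs=11.9552 xscale=11.9552 height=5.983578 space=5.9776006 width=4.552536])",
]
for text, box in string_bounding_boxes(sample):
    print(text, box)
```

The resulting boxes are in the units printed by the Java program, so they would still need to be scaled to the resolution of the PNG. Also note that in the output above "F" and "(s)" are reported as two separate strings, so they would produce two rectangles unless they are merged in a further step.
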