Question

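In Python, is there a way to automatically detect the colors in a certain region of a PDF and either convert them to RGB or compare them against a legend to identify the color?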

Answer 1

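Depending on where you want to extract the information from, you may be able to use minecart. It has very robust color support and can easily convert colors to RGB. You can't pass in a coordinate and get back a color value, but if you are trying to pull color information from shapes, you can do something like this: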

import minecart

# Bounding box in PDF points (72 points per inch)
BOX = (.5 * 72,  # left bounding box edge
       9 * 72,   # bottom bounding box edge
       1 * 72,   # right bounding box edge
       10 * 72)  # top bounding box edge

with open("my-doc.pdf", "rb") as pdf_file:
    doc = minecart.Document(pdf_file)
    page = doc.get_page(0)
    for shape in page.shapes:
        # Skip shapes that have no fill, since they carry no fill color
        if shape.fill and shape.check_in_bbox(BOX):
            r, g, b = shape.fill.color.as_rgb()
            # do stuff with r, g, b

[Disclaimer: I am the author of minecart]

Answer 2

Felipe's approach didn't work for me, but I came up with this:

#!/usr/bin/env python
# -*- Encoding: UTF-8 -*-

import minecart

colors = set()

with open("file.pdf", "rb") as file:
    document = minecart.Document(file)
    page = document.get_page(0)
    for shape in page.shapes:
        if shape.fill:
            colors.add(shape.fill.color.as_rgb())

for color in colors:
    print(color)

This prints a neat list of all the unique RGB values on the first page of the document (you could, of course, extend it to all pages).
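
For the "compare against a legend" part of the question, one option is to match each extracted color to the closest legend entry by distance in RGB space. The following is only a minimal sketch under some assumptions: LEGEND and both helper functions are hypothetical names that are not part of minecart, and the RGB components are assumed to be floats in the 0 to 1 range, so check what as_rgb() returns for your document before relying on it.

```python
# Hypothetical legend: label -> RGB tuple, components assumed to be in 0-1.
LEGEND = {
    "red": (1.0, 0.0, 0.0),
    "green": (0.0, 1.0, 0.0),
    "blue": (0.0, 0.0, 1.0),
}


def to_rgb255(color):
    """Scale an RGB tuple with 0-1 components to 0-255 integers."""
    return tuple(int(round(c * 255)) for c in color)


def nearest_legend_label(color, legend=LEGEND):
    """Return the legend label whose RGB value is closest to `color`.

    Closeness is the squared Euclidean distance in RGB space, which is
    a simple approximation and not a perceptual color difference.
    """
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(color, legend[label]))

    return min(legend, key=squared_distance)


# Example with a made-up color value standing in for an extracted fill
sample = (0.9, 0.1, 0.05)
print(nearest_legend_label(sample), to_rgb255(sample))
```

Each tuple collected in the colors set above could be passed to nearest_legend_label in the same way. Plain RGB distance is easy to reason about, but for legends with many similar shades a perceptual color space may separate the entries better.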
