Related: How do I know if PDF pages are color or black-and-white?
I need to determine, using Java, whether the current page is in color or black and white.
I tried PDFBox, doing the following:
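public void checkColor(final File pdfFile) {
    PDDocument document = null;
    try {
        document = PDDocument.load(pdfFile);
        List<PDPage> pages = document.getDocumentCatalog().getAllPages();
        for (int i = 0; i < pages.size(); i++) {
            System.out.println();
            PDPage page = pages.get(i);
            //BufferedImage image = page.convertToImage();
            // Render the page to an RGB bitmap at 72 dpi.
            BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 72);
            parseColor(image, i);
        }
        printPages();
    } catch (IOException ex) {
        Logger.getLogger(PdfBoxParser.class.getName()).log(Level.SEVERE, null, ex);
    } finally {
        // Release the document's resources even if rendering fails.
        if (document != null) {
            try {
                document.close();
            } catch (IOException ex) {
                Logger.getLogger(PdfBoxParser.class.getName()).log(Level.SEVERE, null, ex);
            }
        }
    }
}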
public static boolean isColorPixel(final int pixel) {
    //took from some post from stackoverflow
    // Debug output: prints the packed ARGB value of every pixel.
    System.out.print(pixel);
    System.out.print(",");
    int alpha = (pixel >> 24) & 0xff;
    int red = (pixel >> 16) & 0xff;
    int green = (pixel >> 8) & 0xff;
    int blue = (pixel) & 0xff;
    // gray: R = G = B
    boolean gray = (red == green && green == blue);
    // A pixel counts as color only when it is NOT gray.
    return !gray;
}
protected void parseColor(BufferedImage pImage, int pPageNumber) {
    int thresholdColor = Main.COLOR_THRESHOLD_PER_PAGE;
    for (int h = 0; h < pImage.getHeight(); h++) {
        for (int w = 0; w < pImage.getWidth(); w++) {
            int pixel = pImage.getRGB(w,h);
            boolean color = Main.isColorPixel(pixel);
            if (color) {
                thresholdColor--;
                if (thresholdColor == 0) {
                    //do something like store this page number...
.
.
.
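The problem is that I have tried various PDFs (e-books, single-page PDFs, etc.), and for every one of them "final int pixel" comes back as "-1". I also get a bunch of warnings (org.apache.pdfbox.util.PDFStreamEngine processOperator: unsupported/disabled operation: i / EMC / BMC / ri). Can this be fixed?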
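For reference, here is a minimal sketch of the pixel test on its own, without PDFBox, so it can be run in isolation. It shows that an opaque white pixel read through `BufferedImage.getRGB` is the packed value 0xFFFFFFFF, which prints as -1 when treated as a signed int, so a run of -1 values would be consistent with white page background. Whether that is what is happening with my PDFs is only an assumption. The class name `PageColorSketch`, the `tolerance` parameter, and the hand-built 10x10 "page" are hypothetical and only for illustration.

```java
import java.awt.image.BufferedImage;

public class PageColorSketch {

    /** Returns true when the R, G and B channels differ by more than tolerance. */
    public static boolean isColorPixel(int argb, int tolerance) {
        int red = (argb >> 16) & 0xff;
        int green = (argb >> 8) & 0xff;
        int blue = argb & 0xff;
        int max = Math.max(red, Math.max(green, blue));
        int min = Math.min(red, Math.min(green, blue));
        return max - min > tolerance;
    }

    /** Counts the pixels of the image that are not gray within the tolerance. */
    public static int countColorPixels(BufferedImage image, int tolerance) {
        int count = 0;
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                if (isColorPixel(image.getRGB(x, y), tolerance)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a rendered page: white with a 2x2 red patch.
        BufferedImage page = new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 10; y++) {
            for (int x = 0; x < 10; x++) {
                page.setRGB(x, y, 0xFFFFFF);
            }
        }
        for (int y = 0; y < 2; y++) {
            for (int x = 0; x < 2; x++) {
                page.setRGB(x, y, 0xFF0000);
            }
        }

        // Opaque white is 0xFFFFFFFF, i.e. -1 as a signed int.
        System.out.println(page.getRGB(9, 9));
        // Only the four red pixels are counted as color.
        System.out.println(countColorPixels(page, 0));
    }
}
```

A non-zero tolerance might help with rendered pages where anti-aliasing or image compression leaves gray pixels with slightly unequal channels, but the right value would have to be found by experiment.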