Question

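I'm new to PDFBox, and I'm writing a class that uses the library to add an image at specific coordinates on an existing page of a PDF document.

So far it has gone well, but one thing worries me.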

    PDDocument doc = PDDocument.load(pdfFile);
    List pages = doc.getDocumentCatalog().getAllPages();
    PDPage page = (PDPage) pages.get(pageNumber);

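This is basically how I get the specific page I want to add the image to. What worries me is that the documentation for getAllPages() says it returns both PDPage and PDPageNode objects. So far in my tests I only ever seem to get PDPage objects, so I've been fine, but I don't want a PDPageNode to show up one day and break my code with a ClassCastException.

So what is the difference between these two classes, and how can I avoid the problem I'm worried about?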

Answer 1

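Don't worry, you will only get PDPage objects (in 1.8.10). The javadoc clearly states that "this method will return a flat list of all PDPage objects in this document", and the source code in PDPageNode confirms it: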

    for( int i=0; i<kids.size(); i++ )
    {
        // ignore duplicates (from malformed PDFs)
        if (!seen.contains(kids.get(i)))
        {
            COSBase obj = kids.getObject( i );
            if (obj instanceof COSDictionary)
            {
                COSDictionary kid = (COSDictionary)obj;
                if( COSName.PAGE.equals( kid.getDictionaryObject( COSName.TYPE ) ) )
                {
                    result.add( new PDPage( kid ) );
                }
                else
                {
                    if (recurse)
                    {
                        getAllKids(result, kid, recurse);
                    }
                    else
                    {
                        result.add( new PDPageNode( kid ) );
                    }
                }
            }
            seen.add(kids.get(i));
        }
    }
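
If you still want a guard against an unexpected element type in a raw `List`, one option is to filter by type before casting. The sketch below is only an illustration using stand-in data: `TypedFilter` and `onlyInstancesOf` are hypothetical names, not part of PDFBox, and the `String`/`Integer` values merely play the roles of PDPage and PDPageNode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TypedFilter {

    /** Copies only the elements that are instances of the given type; everything else is skipped. */
    public static <T> List<T> onlyInstancesOf(List<?> raw, Class<T> type) {
        List<T> result = new ArrayList<>();
        for (Object o : raw) {
            if (type.isInstance(o)) {
                result.add(type.cast(o)); // safe: checked by isInstance above
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in data: Strings play the role of PDPage, the Integer the role of PDPageNode.
        List<Object> mixed = Arrays.<Object>asList("page0", 42, "page1");
        List<String> pages = onlyInstancesOf(mixed, String.class);
        System.out.println(pages); // prints [page0, page1]
    }
}
```

With PDFBox 1.8 on the classpath, the same helper could presumably be called as `onlyInstancesOf(doc.getDocumentCatalog().getAllPages(), PDPage.class)`; that call is untested here. Note that dropping elements would shift the indices of the pages that follow, so this is a safety net rather than something the flat list should ever need.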

If you are just starting out with PDFBox, I recommend using 2.0; its API is easier:

    /**
     * Returns the page at the given index.
     *
     * @param pageIndex the page index
     * @return the page at the given index.
     */
    public PDPage getPage(int pageIndex)

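As an ordinary user you don't need the PDPageNode class. It exists because some nodes are not leaves: page objects in a PDF are organized as a tree, not as a list. But getAllPages() hands them to you as a flat list.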
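
To make the tree-versus-list point concrete, here is a minimal, self-contained sketch of flattening a page tree with a depth-first walk. It is not PDFBox code: `PageTreeSketch`, `Node`, `collectPages` and `flattenSample` are hypothetical names invented for illustration, and a leaf is modelled simply as a node carrying a label.

```java
import java.util.ArrayList;
import java.util.List;

public class PageTreeSketch {

    /** Hypothetical stand-in for a page-tree node; this is not a PDFBox class. */
    static class Node {
        final String pageLabel;                     // non-null only for leaves (pages)
        final List<Node> kids = new ArrayList<>();  // used only by intermediate nodes

        Node(String pageLabel) { this.pageLabel = pageLabel; }

        Node add(Node kid) { kids.add(kid); return this; }
    }

    /** Depth-first walk: leaves are collected in order, intermediate nodes are descended into. */
    static void collectPages(Node node, List<String> out) {
        for (Node kid : node.kids) {
            if (kid.pageLabel != null) {
                out.add(kid.pageLabel);
            } else {
                collectPages(kid, out);
            }
        }
    }

    /** Builds root -> [ intermediate -> [page 1, page 2], page 3 ] and flattens it. */
    public static List<String> flattenSample() {
        Node intermediate = new Node(null).add(new Node("page 1")).add(new Node("page 2"));
        Node root = new Node(null).add(intermediate).add(new Node("page 3"));
        List<String> pages = new ArrayList<>();
        collectPages(root, pages);
        return pages;
    }

    public static void main(String[] args) {
        System.out.println(flattenSample()); // prints [page 1, page 2, page 3]
    }
}
```

The intermediate node never appears in the result; only its leaves do, which mirrors the recursive branch in the getAllKids excerpt quoted in the answer.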

PDFBox: difference between PDPage and PDPageNode
