文档对我来说并不是很清楚。到目前为止,我认为我需要设置一个CGPDFOperatorTable,然后为每个PDF页面创建一个CGPDFContentStreamCreateWithPage和CGPDFScannerCreate。
文档是指设置回调,但我不清楚如何。如何从页面实际获取内容?
到目前为止,这是我的代码。
let pdfURL = NSBundle.mainBundle().URLForResource("titleofdocument", withExtension: "pdf")
// Create pdf document
let pdfDoc = CGPDFDocumentCreateWithURL(pdfURL)
// Nr of pages in this PF
let numberOfPages = CGPDFDocumentGetNumberOfPages(pdfDoc) as Int
if numberOfPages <= 0 {
// The number of pages is zero
return
}
let myTable = CGPDFOperatorTableCreate()
// lets go through every page
for pageNr in 1...numberOfPages {
let thisPage = CGPDFDocumentGetPage(pdfDoc, pageNr)
let myContentStream = CGPDFContentStreamCreateWithPage(thisPage)
let myScanner = CGPDFScannerCreate(myContentStream, myTable, nil)
CGPDFScannerScan(myScanner)
// Search for Content here?
// ??
CGPDFScannerRelease(myScanner)
CGPDFContentStreamRelease(myContentStream)
}
// Release Table
CGPDFOperatorTableRelease(myTable)
这是一个类似的问题:PDF Parsing with SWIFT但还没有答案。
答案 0 :(得分:4)
以下是Swift中实现的回调示例:
let operatorTableRef = CGPDFOperatorTableCreate()
CGPDFOperatorTableSetCallback(operatorTableRef, "BT") { (scanner, info) in
print("Begin text object")
}
CGPDFOperatorTableSetCallback(operatorTableRef, "ET") { (scanner, info) in
print("End text object")
}
CGPDFOperatorTableSetCallback(operatorTableRef, "Tf") { (scanner, info) in
print("Select font")
}
CGPDFOperatorTableSetCallback(operatorTableRef, "Tj") { (scanner, info) in
print("Show text")
}
CGPDFOperatorTableSetCallback(operatorTableRef, "TJ") { (scanner, info) in
print("Show text, allowing individual glyph positioning")
}
let numPages = CGPDFDocumentGetNumberOfPages(pdfDocument)
for pageNum in 1...numPages {
let page = CGPDFDocumentGetPage(pdfDocument, pageNum)
let stream = CGPDFContentStreamCreateWithPage(page)
let scanner = CGPDFScannerCreate(stream, operatorTableRef, nil)
CGPDFScannerScan(scanner)
CGPDFScannerRelease(scanner)
CGPDFContentStreamRelease(stream)
}
答案 1 :(得分:1)
您实际上确切地指明了如何操作,您需要做的就是将它放在一起并尝试直到它工作。
首先,你需要设置一个带回调的表,当你在问题的开头陈述自己时(Objective C中的所有代码,不是Swift):
CGPDFOperatorTableRef operatorTable = CGPDFOperatorTableCreate();
CGPDFOperatorTableSetCallback(operatorTable, "q", &op_q);
CGPDFOperatorTableSetCallback(operatorTable, "Q", &op_Q);
此表包含您要调用的PDF运算符列表,并将回调与它们相关联。那些回调只是你在其他地方定义的函数:
static void op_q(CGPDFScannerRef s, void *info) {
// Do whatever you have to do in here
// info is whatever you passed to CGPDFScannerCreate
}
static void op_Q(CGPDFScannerRef s, void *info) {
// Do whatever you have to do in here
// info is whatever you passed to CGPDFScannerCreate
}
然后你创建扫描仪并开始运行,同时传递你刚刚定义的信息。
// Passing "self" is just an example, you can pass whatever you want and it will be provided to your callback whenever it is called by the scanner.
CGPDFScannerRef contentStreamScanner = CGPDFScannerCreate(contentStream, operatorTable, self);
CGPDFScannerScan(contentStreamScanner);
如果您想查看有关如何查找和处理图像的源代码的完整示例,请check this website。
答案 2 :(得分:-1)