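I am trying to read a PDF file. I want to use this call:
databuffer = [file readDataOfLength : 7];
but I want to read all the bytes in the line, not a fixed 7 bytes.
That is, I use seekToFileOffset
to find the bytes, but I want to read the whole line.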
NSMutableArray *nameArray = [[NSMutableArray alloc] init];
NSMutableArray *nameArrayDict = [[NSMutableArray alloc] init];
NSString *path = [[NSBundle mainBundle] pathForResource:@"testpdf" ofType:@"pdf"];
NSString *contents = [NSString stringWithContentsOfFile:path encoding:NSASCIIStringEncoding error:nil];
// nameArray is filled elsewhere (not shown); it must hold at least two entries here.
int var = [[nameArray objectAtIndex:[nameArray count] - 2] intValue];
NSFileHandle *file;
NSData *databuffer;
// The original passed appFile, which is not defined in this snippet; path is assumed.
file = [NSFileHandle fileHandleForReadingAtPath:path];
int i = 0;
while (file != nil) {
    [file seekToFileOffset:var + i];
    databuffer = [file readDataOfLength:7];
    if ([databuffer length] == 0) break; // end of file; without this the loop never ends
    NSString *aStr;
    aStr = [[NSString alloc] initWithData:databuffer encoding:NSASCIIStringEncoding];
    NSLog(@"%@", aStr);
    i = i + [databuffer length];
}
I have now tried your solution, but nothing is displayed!
CGPDFPageRef page = CGPDFDocumentGetPage(myDocument, 1); // page numbers start at 1
CGPDFDictionaryRef d;
d = CGPDFPageGetDictionary(page);
CGPDFScannerRef myScanner;
CGPDFOperatorTableRef myTable;
myTable = CGPDFOperatorTableCreate();
// Register the callbacks before scanning. In the original order the scan
// had already finished by the time they were set, so none of them ran.
CGPDFOperatorTableSetCallback(myTable, "BT", &op_BT);   // begin text object
CGPDFOperatorTableSetCallback(myTable, "ET", &op_ET);   // end text object
CGPDFOperatorTableSetCallback(myTable, "MP", &op_MP);
CGPDFOperatorTableSetCallback(myTable, "DP", &op_DP);
CGPDFOperatorTableSetCallback(myTable, "BMC", &op_BMC);
CGPDFOperatorTableSetCallback(myTable, "BDC", &op_BDC);
CGPDFOperatorTableSetCallback(myTable, "EMC", &op_EMC);
CGPDFContentStreamRef myContentStream = CGPDFContentStreamCreateWithPage(page);
myScanner = CGPDFScannerCreate(myContentStream, myTable, NULL);
CGPDFScannerScan(myScanner); // runs synchronously and invokes the callbacks
CGPDFScannerRelease(myScanner);
CGPDFContentStreamRelease(myContentStream);
CGPDFOperatorTableRelease(myTable);
Answer 0 (score: 0)
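I would advise against parsing your PDF the way you are doing it.
Try CGPDFScanner instead
(docs here)
As JeremyP said, a lot of the content in a PDF file is zlib-compressed, so searching for line endings will not get you far. Use CGPDFScanner to extract font maps, images, and so on.
But it is not easy, I can tell you. :)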
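To make the CGPDFScanner suggestion concrete, here is a minimal sketch that collects the string operands of the Tj (show text) operator on one page. The helper name TextShownByTj and the callback name op_Tj are made up for illustration and are not from the question. The sketch ignores the TJ array operator and font encodings, so the output may be incomplete or garbled for many real PDFs.

```objective-c
#import <Foundation/Foundation.h>
#import <CoreGraphics/CoreGraphics.h>

// Callback for the "Tj" operator: pops its string operand from the scanner
// stack and appends it to the CFMutableString passed in as `info`.
static void op_Tj(CGPDFScannerRef scanner, void *info) {
    CGPDFStringRef pdfString = NULL;
    if (!CGPDFScannerPopString(scanner, &pdfString)) return;
    CFStringRef text = CGPDFStringCopyTextString(pdfString);
    if (text != NULL) {
        CFStringAppend((CFMutableStringRef)info, text);
        CFRelease(text);
    }
}

// Hypothetical helper: returns the text shown by Tj operators on the given
// page (1-based), or nil if the document cannot be opened.
static NSString *TextShownByTj(NSURL *url, size_t pageNumber) {
    CGPDFDocumentRef document = CGPDFDocumentCreateWithURL((__bridge CFURLRef)url);
    if (document == NULL) return nil;

    CFMutableStringRef collected = CFStringCreateMutable(kCFAllocatorDefault, 0);
    CGPDFPageRef page = CGPDFDocumentGetPage(document, pageNumber);
    if (page != NULL) {
        // Callbacks must be registered before the scan starts.
        CGPDFOperatorTableRef table = CGPDFOperatorTableCreate();
        CGPDFOperatorTableSetCallback(table, "Tj", &op_Tj);

        CGPDFContentStreamRef stream = CGPDFContentStreamCreateWithPage(page);
        CGPDFScannerRef scanner = CGPDFScannerCreate(stream, table, collected);
        CGPDFScannerScan(scanner); // decompresses the stream and fires op_Tj

        CGPDFScannerRelease(scanner);
        CGPDFContentStreamRelease(stream);
        CGPDFOperatorTableRelease(table);
    }
    CGPDFDocumentRelease(document);
    return CFBridgingRelease(collected);
}
```

With the path from the question, it could be called as NSLog(@"%@", TextShownByTj([NSURL fileURLWithPath:path], 1)); the scanner takes care of decompressing the content stream, which is the part that reading raw bytes with NSFileHandle cannot do.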