How do I validate that NSData is a PDF?

Asked: 2010-09-29 16:10:01

Tags: iphone objective-c pdf encoding nsdata

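I'm working on an iPhone reading app that displays NSData (HTML and PDF) in a UIWebView, and I've hit a snag in some PDF validation logic. I have an NSData object that I know contains a file with a .pdf extension. I would like to stop invalid PDFs from getting any further. Here is my first attempt at validation code, which seems to work in most cases: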

// pdfData is an NSData *
NSData *validPDF = [@"%PDF" dataUsingEncoding:NSASCIIStringEncoding];
// Check the length first: subdataWithRange: raises an exception on data shorter than 4 bytes.
if (!(pdfData.length >= 4 && [[pdfData subdataWithRange:NSMakeRange(0, 4)] isEqualToData:validPDF])) {
    // error
}

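Unfortunately, a new PDF was uploaded a few days ago. It is valid in the sense that the UIWebView displays it fine, yet it fails my validation test. I've tracked the problem down to a run of garbage bytes at the beginning of the file, with %PDF starting midway through the 14th group of hex characters (the 25, or %, sits at zero-based byte offset 54):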

%PDF: 25504446
Breaking PDF: 00010000 00ffffff ff010000 00000000 000f0100 0000b5e0 04000200 01000000 ffffffff 01000000 00000000 0f010000 0099e004 00022550 44462d31 etc...

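What is the best practice for validating NSData as a PDF? And what might be wrong with this particular PDF (it claims it was encoded by PaperPort 11.0, whatever that is)?

Thanks,

Mike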

5 answers:

Answer 0: (score: 4)

This question looks helpful:

Detect if PDF file is correct (header PDF)

Or, if you're feeling adventurous, here's the spec (from the Adobe website here).
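As far as I recall, the implementation notes in Adobe's PDF Reference say that Acrobat viewers only require the header to appear somewhere within the first 1024 bytes of the file, which would explain why the UIWebView accepts a file with junk in front of it. Here is a minimal Swift sketch of that check, run against the leading bytes quoted in the question. The names `pdfHeaderOffset`, `dataFromHex`, and the `window` parameter are made up for this example; they are not part of any Apple API.

```swift
import Foundation

/// Returns the zero-based offset of the first "%PDF-" marker within the
/// first `window` bytes of `data`, or nil if the marker is not there.
func pdfHeaderOffset(in data: Data, window: Int = 1024) -> Int? {
    let marker = Array("%PDF-".utf8)
    let head = Array(data.prefix(window))
    guard head.count >= marker.count else { return nil }
    for start in 0...(head.count - marker.count) {
        if head[start...].starts(with: marker) {
            return start
        }
    }
    return nil
}

/// Converts a whitespace-separated hex dump into bytes.
/// Assumes well-formed input (an even number of hex digits).
func dataFromHex(_ hex: String) -> Data {
    let digits = Array(hex.filter { $0 != " " })
    var out = Data()
    var i = 0
    while i + 1 < digits.count {
        out.append(UInt8(String(digits[i...(i + 1)]), radix: 16)!)
        i += 2
    }
    return out
}

// The first 60 bytes of the problem file, copied from the question.
let problemPrefix = dataFromHex(
    "00010000 00ffffff ff010000 00000000 000f0100 0000b5e0 04000200 01000000 " +
    "ffffffff 01000000 00000000 0f010000 0099e004 00022550 44462d31")

if let offset = pdfHeaderOffset(in: problemPrefix) {
    print("header found at offset \(offset)")  // header found at offset 54
} else {
    print("no PDF header in the first 1024 bytes")
}
```

Returning the offset rather than a Bool leaves the caller free to reject files whose header is not at offset 0, or to accept them and log how much leading junk was present.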

Answer 1: (score: 4)

// `caption` is assumed to be a String? defined elsewhere in the app.
let fileManager = FileManager.default
let documentsPath = NSSearchPathForDirectoriesInDomains(.documentDirectory, .userDomainMask, true)[0]
let rootDirectory = "\(documentsPath)/\(caption!)/"
let fileURL = URL(fileURLWithPath: rootDirectory).appendingPathComponent("0")
var isPDF = false
// Read the file only if it exists; `try?` yields nil on a read error.
if fileManager.fileExists(atPath: fileURL.path), let data = try? Data(contentsOf: fileURL) {
    let pdfHeader = Data([0x25, 0x50, 0x44, 0x46]) // "%PDF"
    // Search the first 1024 bytes, or the whole file if it is shorter.
    // The .anchored option is left out so the header may follow junk bytes.
    isPDF = data.range(of: pdfHeader, options: [], in: 0..<min(data.count, 1024)) != nil
}

Answer 2: (score: 3)

In Swift I have the following:

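// assetData is an NSData
var isPDF = false
let pdfBytes: [UInt8] = [0x25, 0x50, 0x44, 0x46] // "%PDF"
let pdfHeader = NSData(bytes: pdfBytes, length: 4)
// Search at most the first 1024 bytes; the range is clamped so that
// shorter data does not raise a range exception.
let searchRange = NSRange(location: 0, length: min(assetData.length, 1024))
let foundRange = assetData.range(of: pdfHeader as Data, options: [], in: searchRange)
if foundRange.location != NSNotFound
{
    isPDF = true
}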

Answer 3: (score: 3)

You can try this:

    // Validate PDF using NSData
    - (BOOL)isValidePDF:(NSData *)pdfData {
        NSUInteger startMetaCount = 4, endMetaCount = 5;
        // Too short to hold both markers (this also covers nil, whose length is 0).
        if (pdfData.length < startMetaCount + endMetaCount) {
            return NO;
        }
        // Markers a PDF should contain: "%PDF" near the start and "%%EOF" at the end.
        NSData *startPDFData = [NSData dataWithBytes:"%PDF" length:startMetaCount];
        NSData *endPDFData = [NSData dataWithBytes:"%%EOF" length:endMetaCount];
        // Look for the header in the first 1024 bytes (clamped to the data length)
        // and for the trailer anywhere in the data.
        NSRange startRange = [pdfData rangeOfData:startPDFData options:0 range:NSMakeRange(0, MIN(pdfData.length, (NSUInteger)1024))];
        NSRange endRange = [pdfData rangeOfData:endPDFData options:0 range:NSMakeRange(0, pdfData.length)];
        // Neither marker has to sit at a fixed offset; both only have to be present.
        return startRange.location != NSNotFound && endRange.location != NSNotFound;
    }
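For comparison, here is roughly the same two-marker idea as a Swift sketch that only looks at the head and the tail of the data instead of scanning the whole buffer for the trailer. It assumes, again from my reading of the implementation notes in Adobe's PDF Reference, that viewers expect %%EOF somewhere in the last 1024 bytes. `containsBytes`, `looksLikePDF`, and `window` are hypothetical names for this example.

```swift
import Foundation

/// True when `marker` occurs anywhere in `bytes`.
func containsBytes(_ marker: [UInt8], in bytes: [UInt8]) -> Bool {
    guard !marker.isEmpty, bytes.count >= marker.count else { return false }
    for start in 0...(bytes.count - marker.count) {
        if bytes[start...].starts(with: marker) {
            return true
        }
    }
    return false
}

/// Cheap sanity check: "%PDF-" in the first `window` bytes and
/// "%%EOF" in the last `window` bytes. It does not parse the file.
func looksLikePDF(_ data: Data, window: Int = 1024) -> Bool {
    let head = Array(data.prefix(window))
    let tail = Array(data.suffix(window))
    return containsBytes(Array("%PDF-".utf8), in: head)
        && containsBytes(Array("%%EOF".utf8), in: tail)
}

let complete = Data("%PDF-1.4\n1 0 obj\nendobj\n%%EOF\n".utf8)
let truncated = Data("%PDF-1.4\n1 0 obj\n".utf8)
print(looksLikePDF(complete))   // true
print(looksLikePDF(truncated))  // false
```

This catches truncated uploads that a header-only check would let through, but it is still a heuristic: data that happens to contain both markers passes even if the objects in between are corrupt.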

Answer 4: (score: 1)

Swift 4

extension Data {
    var isPDF: Bool {
        let pdfHeader = Data("%PDF".utf8)
        // Search at most the first 1024 bytes; the upper bound is clamped
        // so that data shorter than 1024 bytes is still checked.
        let searchEnd = index(startIndex, offsetBy: Swift.min(count, 1024))
        return range(of: pdfHeader, options: [], in: startIndex..<searchEnd) != nil
    }
}