PDF manipulation - images are distorted after a few consecutive operations on a PDF file

Time: 2017-07-12 08:19:36

Tags: php pdf encryption ms-word mpdf

I've run into this weird issue with PDF file handling. Not sure if SO is the right place to ask this, but I couldn't find any specific sites for this. I hope that someone can shed some light on the issue.

This happens only with the specific process described below; if some of the steps are omitted, the issue is not observed.

I have a PHP application that serves PDF files to users. These files are created by authors in MS Word 2007, then printed to a protected PDF (most likely using pdf995; I can confirm if needed). I'll refer to this initial PDF file as the 'source' below.

Upon request, the source file is processed in PHP as follows:

First, we decrypt it using qpdf:

qpdf --decrypt "source.pdf" "tmp_output.pdf"
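
For reference, a minimal sketch of how this step might be invoked from PHP. The function names `buildQpdfDecryptCommand` and `decryptPdf` are hypothetical and not part of the application described here; the exit-code handling assumes qpdf's documented convention of 0 for success and 3 for "warnings only".

```php
<?php
// Hypothetical helper: builds the same qpdf command line as above,
// quoting both paths so that spaces or shell metacharacters are safe.
function buildQpdfDecryptCommand(string $source, string $target): string
{
    return 'qpdf --decrypt ' . escapeshellarg($source) . ' ' . escapeshellarg($target);
}

// Hypothetical wrapper: runs qpdf and throws if it reports an error.
function decryptPdf(string $source, string $target): void
{
    // 2>&1 merges stderr into the captured output for the error message
    exec(buildQpdfDecryptCommand($source, $target) . ' 2>&1', $output, $exitCode);

    // Assumption: 0 = success, 3 = warnings only; anything else is a failure
    if ($exitCode !== 0 && $exitCode !== 3) {
        throw new RuntimeException("qpdf failed ($exitCode): " . implode("\n", $output));
    }
}
```

Calling `decryptPdf('source.pdf', 'tmp_output.pdf')` would then produce the decrypted file used in the next step.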

Then we add a security label / watermark to it, encrypt it, and output it to the browser using mPDF 6.0:

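$mpdf = new mPDF();
$mpdf->SetImportUse(); // enable importing pages from existing PDFs (FPDI)

// $fpath points to the decrypted file produced by qpdf
$pagecount = $mpdf->SetSourceFile($fpath);
if ($pagecount) {
    for ($i = 1; $i <= $pagecount; $i++) {
        // Import page $i of the decrypted file and use it as a template
        $tplId = $mpdf->ImportPage($i);
        $mpdf->UseTemplate($tplId);

        $html = '[security label / watermark contents...]';

        $mpdf->WriteHTML($html);
    }
}

// Allow copying and printing; empty user password, owner password 'password', 128-bit key
$mpdf->SetProtection(array('copy','print'), '', 'password', 128);

// 'I' sends the file inline to the browser
$mpdf->Output('final_output.pdf', 'I');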

With the exact steps described above, images that were pasted into the Word doc appear in the output as follows:

[image: distorted pasted image in final_output.pdf]

In both the source PDF and tmp_output.pdf (the qpdf-decrypted file), the pasted images look correct:

[image: the same pasted image rendered correctly]

The distortion doesn't occur if either of the following is true:

  • The Word doc is printed to PDF without protection.
  • The mPDF output is not protected.

As you can see, there are too many factors, so I don't know where to look for the bug. Each component works correctly on its own, and I cannot find any information on the issue. Any insights are greatly appreciated.

EDIT 1

After some more testing, it appears that this only happens to screenshots taken from a web browser, Windows Explorer, or MS Word. I cannot reproduce it with screenshots taken from GIMP.

It appears that something along the way attempts to convert white to alpha and fails.

1 answer:

Answer 0 (score: 1)

The current version of mPDF (6.1) has a bug: it does not correctly handle escaped PDF strings (imported via FPDI) when they are supposed to be encrypted.

A pull request that fixes this issue is available here.
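
To illustrate what "escaped PDF strings" means in this context, here is a minimal sketch. It is not mPDF's code and not the code from the pull request; `pdfUnescapeLiteral` is a hypothetical helper that decodes the escape sequences the PDF specification defines for literal strings. The general idea is that a string copied from an imported file is stored in escaped form, and encryption has to operate on the decoded raw bytes; encrypting the escaped text instead would yield different bytes after decryption.

```php
<?php
/**
 * Hypothetical helper: decode the escape sequences of a PDF literal
 * string body (the bytes between the outer parentheses).
 */
function pdfUnescapeLiteral(string $s): string
{
    // Single-character escapes defined for PDF literal strings
    $map = ['n' => "\n", 'r' => "\r", 't' => "\t", 'b' => "\x08",
            'f' => "\x0C", '(' => '(', ')' => ')', '\\' => '\\'];
    $out = '';
    $len = strlen($s);
    for ($i = 0; $i < $len; $i++) {
        $c = $s[$i];
        if ($c !== '\\') {
            $out .= $c;          // ordinary byte, copied as-is
            continue;
        }
        if (++$i >= $len) {
            break;               // lone trailing backslash: ignored
        }
        $c = $s[$i];
        if (isset($map[$c])) {
            $out .= $map[$c];
        } elseif (strpos('01234567', $c) !== false) {
            // Octal escape: one to three octal digits
            $oct = $c;
            while (strlen($oct) < 3 && $i + 1 < $len
                && strpos('01234567', $s[$i + 1]) !== false) {
                $oct .= $s[++$i];
            }
            $out .= chr(octdec($oct) & 0xFF);
        } elseif ($c === "\r") {
            // Backslash + end-of-line is a line continuation: emit nothing
            if ($i + 1 < $len && $s[$i + 1] === "\n") {
                $i++;
            }
        } elseif ($c === "\n") {
            // Line continuation: emit nothing
        } else {
            $out .= $c;          // unknown escape: the backslash is dropped
        }
    }
    return $out;
}
```

For example, the ten characters `\377\000\(` decode to three raw bytes (0xFF, 0x00 and an opening parenthesis), so the escaped and decoded forms are clearly not interchangeable as encryption input.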