Question

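I am using wkhtmltopdf to generate a PDF and want to overwrite any existing file.

I wasn't sure whether it guarantees an atomic update, and we may switch to a different PDF tool some day, so I wrapped it in some PHP code that uses a temporary file. Once the temporary file has been created, I use PHP's rename() function to overwrite the actual file.

I made sure that the temporary file and the output file are on the same partition, but when I run my script and request the PDF at the exact moment it is being overwritten, the browser sometimes shows a "PDF file cannot be displayed" type of message.

How can I go about debugging this? I don't see any errors in the Apache error log. In the access log I see "200" and "206" requests. I'm not sure what I should be looking for in the content length, or how pdf.js works with the server.

The code looks like this: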

$output = sprintf(__DIR__."/pdfs/%s.pdf", $id);
$tmpOutput = $output . '.tmp';
$cmd = 'wkhtmltopdf '. escapeshellarg($url) . ' ' . escapeshellarg($tmpOutput);

exec($cmd);
chmod($tmpOutput, 0777);
rename($tmpOutput, $output);
chmod($output, 0777);

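It's worth mentioning that I'm using chmod to work around the fact that I run this in a Gearman worker run by supervisor, which is started by root. If this were a permissions problem, I would expect an error in the Apache error log, but there is none. I would also expect a 403 status code or similar, but all I see in the logs is 200 or 206.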

Answer 1

Most PDF readers fetch PDF files over the network in chunks, that is, in several HTTP requests that use the Range header to specify which bytes of the file they want (e.g. bytes=1000-5000, which is 4001 bytes because both ends are inclusive). The web server answers each of these with an HTTP 206 Partial Content response. If you change the PDF file between these partial requests, the PDF reader receives a corrupted file (part from the old file, part from the new file).
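To make the failure mode concrete, here is a minimal local simulation, not a reproduction of what pdf.js actually sends. Two offset reads stand in for two Range requests, and the file is replaced with rename() in between. The file names, the 8-byte contents, and the readRange() helper are all made up for illustration.

```php
<?php
// Simulation only: two offset reads stand in for two HTTP 206 range requests.
$dir  = sys_get_temp_dir();
$path = tempnam($dir, 'pdf');        // hypothetical stand-in for pdfs/<id>.pdf
$old  = str_repeat('A', 8);          // "old" file contents
$new  = str_repeat('B', 8);          // "new" file contents
file_put_contents($path, $old);

/** Read bytes $start..$end (both inclusive), the way a Range request would. */
function readRange(string $path, int $start, int $end): string
{
    return file_get_contents($path, false, null, $start, $end - $start + 1);
}

// First "request" is served from the old file.
$first = readRange($path, 0, 3);

// The file is replaced atomically between the two requests.
$tmp = tempnam($dir, 'pdf');
file_put_contents($tmp, $new);
rename($tmp, $path);

// Second "request" is served from the new file.
$second = readRange($path, 4, 7);

// The client ends up with a mix that matches neither version.
$combined = $first . $second;
echo $combined, PHP_EOL;             // prints "AAAABBBB"

unlink($path);
```

Each individual read sees a complete, consistent file, so the rename() itself is doing its job. The damage comes from stitching together responses that belong to two different versions.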

HTTP should prevent this: with the first response the PDF reader should also receive an ETag header, a validator that changes whenever the file changes. On subsequent requests the PDF reader should send that value back in an If-Range (or If-Match) header, so that the web server can tell it whether the file is still the same. But sometimes this doesn't work. You can disable Range requests in the Apache configuration (or an .htaccess file) with the following, which requires mod_headers:

<Files *.pdf>
  Header set Accept-Ranges none
</Files>

Also make sure that your temp file name is always unique, so that no two PHP processes can write to the same temp file at the same time.
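One way to get a unique temp file is PHP's tempnam(), called with the target's own directory so that rename() stays on one filesystem. The sketch below is only an illustration under those assumptions: atomicReplace() and renderPdf() are hypothetical names, the 0644 default mode is a guess at what the web server needs, and the zero-length check is an extra safeguard that is not in the question's code.

```php
<?php
/**
 * Hypothetical helper: build the file under a unique temp name in the
 * target's directory, then rename() it over the target.
 * $produce is any callable that writes the finished content to the path it is given.
 */
function atomicReplace(string $target, callable $produce, int $mode = 0644): void
{
    // tempnam() creates a uniquely named empty file. Note that it falls back
    // to the system temp directory if the given directory is not usable.
    $tmp = tempnam(dirname($target), 'pdf');
    if ($tmp === false) {
        throw new RuntimeException('Could not create a temporary file');
    }
    try {
        $produce($tmp);
        clearstatcache(true, $tmp);
        if (!is_file($tmp) || filesize($tmp) === 0) {
            throw new RuntimeException('Producer wrote no output');
        }
        chmod($tmp, $mode);              // set permissions before the file goes live
        if (!rename($tmp, $target)) {
            throw new RuntimeException('rename() failed');
        }
    } finally {
        // Remove the temp file if it is still around after a failure.
        clearstatcache(true, $tmp);
        if (is_file($tmp)) {
            unlink($tmp);
        }
    }
}

/** Hypothetical wrapper showing how the question's wkhtmltopdf call could plug in. */
function renderPdf(string $url, string $output): void
{
    atomicReplace($output, function (string $tmp) use ($url): void {
        $cmd = 'wkhtmltopdf ' . escapeshellarg($url) . ' ' . escapeshellarg($tmp);
        exec($cmd, $lines, $status);
        if ($status !== 0) {
            throw new RuntimeException("wkhtmltopdf exited with status $status");
        }
    });
}
```

Checking the exit status and the file size before the rename() means a failed or empty render never replaces a good PDF. This only addresses collisions between writers; it does not change how a reader that is already part-way through a series of Range requests behaves.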

Debugging a file that is not updated atomically (possible PDF.js browser issue)
