How to decode binary data embedded in XML using XSLT?

Asked: 2016-10-20 19:16:11

Tags: xml xslt lotus-notes saxon

I have XML data extracted from a legacy Lotus Notes application. The XML has embedded binary data. I am guessing, based on information on the IBM Lotus Notes website, that it is encoded in base64 format, but I am not certain of this. Some of the binary data appears to be images, while some of it appears to be embedded MS Word documents. I am using the Saxon XSLT processor. How can I decode this binary data using XSLT?

The data looks roughly like this:

<objectref version='2' name='EXT12682' class='Word.Document.8'
    displayformat='metafile' description='Microsoft Word Document' classid='{00020906-0000-0000-c000-000000000046}'
    storageformat='structstorage'><picture height='289px' width='625px' scaledheight='3.0104in'
        scaledwidth='6.5104in'><notesbitmap>illegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygook</notesbitmap></picture></objectref>


<file hosttype='bytearraypage'
    compression='none' flags='storedindoc' name='STG12172'>
    <created><datetime dst='true'>20080924T171730,05-04</datetime></created>
    <modified><datetime dst='true'>20080924T171730,05-04</datetime></modified><filedata>illegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygook</filedata></file>

1 answer:

Answer 0 (score: 0)

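Recent releases of Saxon (PE and EE) include an implementation of the EXPath binary module (http://expath.org/spec/binary), which contains everything you need to manipulate binary data, except of course a specification of how you want the binary data manipulated. If you know what the input structure is and what you want the output to look like, then the binary functions should help you, but I fear from your question that you don't really know either.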

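If you think the binary data is, for example, a base64-encoded JPEG file, then you don't actually need the EXPath binary module: the EXPath file module (also implemented in Saxon PE and EE) should suffice. See http://expath.org/spec/file#fn.write-binary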

You can do:

file:write-binary("output.jpeg", xs:base64Binary(jpegBitMap))

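to write the content of the binary element to an external file. You can then try to open that file with an application that understands the relevant format.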

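(Take care with these functions: they have side effects, which do not fit well with XQuery or XSLT. For example, don't call them in a variable initializer, because nothing will happen if the variable is never used.)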
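
To make this concrete, here is a minimal sketch of a complete stylesheet that applies the call to the sample in the question. It rests on several assumptions: the `filedata` content really is base64, the elements are in no namespace as in the sample shown (if your export declares a default namespace, the paths need a prefix bound to it), and you are running Saxon PE or EE so that the EXPath file module is available. The `.bin` extension and the `target` variable name are arbitrary choices for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:file="http://expath.org/ns/file"
    exclude-result-prefixes="xs file">

  <!-- The result tree is only a plain-text log; the real work is the file writes. -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Assumption: embedded files look like the <file> sample in the question. -->
    <xsl:for-each select="//file[filedata]">
      <!-- Hypothetical naming scheme: the name attribute plus an arbitrary extension. -->
      <xsl:variable name="target" select="concat(@name, '.bin')"/>

      <!-- Call the function directly in the template body, not in the
           initializer of a variable that might never be referenced.
           The cast fails with an error if the text is not valid base64. -->
      <xsl:sequence select="file:write-binary($target, xs:base64Binary(filedata))"/>

      <xsl:value-of select="concat('wrote ', $target, '&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

If the cast to `xs:base64Binary` raises an error, that is a strong hint the data is not base64 after all. If the files are written but no application opens them, the content is probably in a Notes-specific format rather than a standard one, and you would need to know that format before the binary module can help.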