I have XML data extracted from a legacy Lotus Notes application. The XML has embedded binary data. I am guessing, based on information on the IBM Lotus Notes website, that it is encoded in base64 format, but I am not certain of this. Some of the binary data appears to be images, while some of it appears to be embedded MS Word documents. I am using the Saxon XSLT processor. How can I decode this binary data using XSLT?
The data looks roughly like this:
<objectref version='2' name='EXT12682' class='Word.Document.8'
displayformat='metafile' description='Microsoft Word Document' classid='{00020906-0000-0000-c000-000000000046}'
storageformat='structstorage'><picture height='289px' width='625px' scaledheight='3.0104in'
scaledwidth='6.5104in'><notesbitmap>illegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygook</notesbitmap></picture></objectref>
<file hosttype='bytearraypage'
compression='none' flags='storedindoc' name='STG12172'>
<created><datetime dst='true'>20080924T171730,05-04</datetime></created>
<modified><datetime dst='true'>20080924T171730,05-04</datetime></modified><filedata>illegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygookillegiblegobbledygook</filedata></file>
Answer 0 (score: 0)
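Recent releases of Saxon (PE and EE) include an implementation of the EXPath binary module (http://expath.org/spec/binary), which contains everything you need to manipulate binary data, except of course a specification of the binary data you want to manipulate. If you know the structure of your input, and you know what the output you want to produce should look like, then the binary functions should help you, but I fear your problem is that you don't really know either.
If you think the binary data is, for example, a base64-encoded JPEG file, then you don't actually need the EXPath binary module: the EXPath file module (also implemented in Saxon PE and EE) should be sufficient. See http://expath.org/spec/file#fn.write-binary
You can do something like: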
file:write-binary("output.jpeg", xs:base64Binary(jpegBitMap))
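to write the content of the binary element to an external file, which you can then try opening with an application that understands the relevant format.
(Be careful with these functions: they have side effects, which do not sit well with XQuery or XSLT. For example, don't call them in a variable initializer, because the call will never be made if the variable is never used.)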
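To put the call above in context, here is a minimal stylesheet sketch that writes every filedata element from the question's sample to its own file. It assumes the content really is base64 (the question is not certain of this), that Saxon PE or EE is used so the EXPath file module is available, and that the output names (filedata-1.bin and so on) are purely illustrative. The `*:filedata` wildcard is used in case the real export puts its elements in a namespace, which the sample does not show.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:file="http://expath.org/ns/file"
    exclude-result-prefixes="xs file">

  <xsl:template match="/">
    <!-- Visit each filedata element, whatever namespace it is in -->
    <xsl:for-each select="//*:filedata">
      <!-- Assumption: the text is base64; the cast raises an error if it is not.
           The call sits directly in the template body rather than in a variable,
           so it is not skipped as an unused value. -->
      <xsl:sequence select="file:write-binary(
                                concat('filedata-', position(), '.bin'),
                                xs:base64Binary(.))"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

If the cast fails, the content is probably not plain base64. If it succeeds, inspecting the first few bytes of each output file may help identify the format before choosing a file extension.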