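Using Ruby to convert files uploaded to the database into CDATA in an XML file

Time: 2018-02-07 20:56:29

Tags: ruby xml paperclip nokogiri cdata

I store files through Paperclip on a documents table. The files are Word, CSV, XLSX, and PDF. I want to read these records from the database, convert each file to CDATA, and put it in an XML tag named Attachment.

This is the result I expect: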

<Attachment ContentType="xlsx" Extension="xlsx" Description="TEST2.xlsx"><![CDATA[UEsDBBQABgAIAAAAIQBxDjkrcAEAAKAFAAATANsBW0NvbnRlbnRfVHlwZXNdLnhtbCCi1wEooAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA..

When I query a record, the file looks like this:

#<Paperclip::Attachment:0x000000049dfe28 @name=:picture, @name_string="picture", @instance=#<Document id: 225, picture_file_name: "data", picture_content_type: "application/vnd.openxmlformats-officedocument.word...", picture_file_size: 10001, picture_updated_at: "2018-02-07 20:14:18", ece_id: 242, created_at: "2018-02-07 20:14:18", updated_at: "2018-02-07 20:14:18", xml_file_name: nil, xml_content_type: nil, xml_file_size: nil, xml_updated_at: nil, whodunnit: nil, document_type: "Attachment">, @options={:convert_options=>{}, :default_style=>:original, :default_url=>"/:attachment/:style/missing.png", :escape_url=>true, :restricted_characters=>/[&$+,\/:;=?@<>\[\]\{\}\|\\\^~%# ]/, :filename_cleaner=>nil, :hash_data=>":class/:attachment/:id/:style/:updated_at", :hash_digest=>"SHA1", :interpolator=>Paperclip::Interpolations, :only_process=>[], :path=>":rails_root/public:url", :preserve_files=>false, :processors=>[:thumbnail], :source_file_options=>{}, :storage=>:filesystem, :styles=>{}, :url=>"/system/:class/:attachment/:id_partition/:style/:filename", :url_generator=>Paperclip::UrlGenerator, :use_default_time_zone=>true, :use_timestamp=>true, :whiny=>true, :validate_media_type=>true, :check_validity_before_processing=>true}, @post_processing=true, @queued_for_delete=[], @queued_for_write={}, @errors={}, @dirty=false, @interpolator=Paperclip::Interpolations, @url_generator=#<Paperclip::UrlGenerator:0x000000049dfd10 @attachment=#<Paperclip::Attachment:0x000000049dfe28 ...>>, @source_file_options={}, @whiny=true> 

I use Nokogiri to generate the XML tags.

Do you have any ideas? Thanks in advance.

1 answer:

Answer 0 (score: 1)

You can read the file contents with Paperclip like this:

file_content = Paperclip.io_adapters.for(attachment.file).read

There are several ways to write a CDATA section into an XML document with Nokogiri, depending on how you build the document. Here is one from the Nokogiri cheat sheet:

doc.create_cdata(file_content)

However, you should Base64-encode the file contents first, because they are binary and may contain characters that are not allowed in an XML document.

That is:

require 'base64' # if not already required
Base64.encode64(file_content)
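
One detail that may matter here: `Base64.encode64` inserts a line feed after every 60 encoded characters, while `Base64.strict_encode64` produces a single unbroken line. Both decode back to the same bytes; which form the consumer of the XML expects is something you would need to check. A minimal sketch, where the byte string is a made-up stand-in for the real file contents:

```ruby
require 'base64'

# Hypothetical stand-in for the bytes read from the attachment:
# every byte value from 0 to 255, as a binary (ASCII-8BIT) string.
file_content = (0..255).to_a.pack('C*')

# encode64 wraps the output with line feeds.
wrapped = Base64.encode64(file_content)

# strict_encode64 emits one line with no line feeds.
single_line = Base64.strict_encode64(file_content)

# decode64 ignores the line feeds; strict_decode64 would reject them.
from_wrapped = Base64.decode64(wrapped)
from_single  = Base64.strict_decode64(single_line)

puts wrapped.include?("\n")
puts single_line.include?("\n")
puts from_wrapped == file_content && from_single == file_content
```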

Putting it together, here is a pseudocode snippet:

file_content = Paperclip.io_adapters.for(attachment.file).read
doc.create_cdata(Base64.encode64(file_content))

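You should also consider putting the Base64 text in a plain element instead of a CDATA section; see the answers to related questions on this topic.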