XSLT, Ruby: how do I output the name of the first element under the root?

Asked: 2019-06-19 19:47:11

Tags: ruby xml csv xslt nokogiri

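I am working on a Ruby script that uses XSLT to convert XML to CSV. Part of my code's logic is to dynamically get the name of the parent element directly under the root, so that each such element is treated as one record row in the CSV file. Transforming the XML in Oxygen gives me what I want, but with Nokogiri I get this error: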

  

/Library/Ruby/Gems/2.3.0/gems/nokogiri-1.10.3/lib/nokogiri/xslt.rb:32:in `parse_stylesheet_doc': compilation error: file selectXMLelement.xsl line 5 element stylesheet (RuntimeError)
xsl:version: only 1.1 features are supported
compilation error: file selectXMLelement.xsl line 8 element value-of
xsl:value-of : could not compile select expression 'concat(':',/data:root/*/local-name())'
	from /Library/Ruby/Gems/2.3.0/gems/nokogiri-1.10.3/lib/nokogiri/xslt.rb:32:in `parse'
	from /Library/Ruby/Gems/2.3.0/gems/nokogiri-1.10.3/lib/nokogiri/xslt.rb:13:in `XSLT'
	from EXTC-v1.rb:37:in `api_component'
	from EXTC-v1.rb:43:in `block in <main>'
	from EXTC-v1.rb:43:in `each'
	from EXTC-v1.rb:43:in `<main>'

I would like to know whether there is a way to get what I want with Nokogiri instead of XSLT, and how to fit it into the logic of my Ruby script.

I tried this XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:data="urn:com.sample/bsvc"
    exclude-result-prefixes="data"
    version="2.0">
    <xsl:output method="text"/>
    <xsl:template match="/">
            <xsl:value-of select="concat(':',/data:root/*[1]/local-name())"/>
    </xsl:template>
</xsl:stylesheet>

Sample XML. In Oxygen, my XSLT successfully outputs what I want, ":Data_Request":

<data:root>
    <data:Data_Request>
        <data:name>John Doe</data:name>
        <data:phone>123456776</data:phone>
    </data:Data_Request>
</data:root>

My Ruby script:

require 'nokogiri'
require 'set'

def xslt_transform(filename)
  #dir = File.join(Dir.pwd,'/input/')
  xml_str = File.read(filename)
  doc = Nokogiri::XML xml_str
  template = Nokogiri::XSLT(File.open('Remove-CDATA.xsl'))
  transformed_doc = template.transform(doc)

  File.write(filename, transformed_doc)
end

Dir.glob('*.xml').each {|filename| xslt_transform(filename)}
# This is where I am trying to use the XSLT
def api_component(filename)
  xml_str = File.read(filename)
  doc = Nokogiri::XML xml_str
  template = Nokogiri::XSLT(File.open('selectXMLelement.xsl'))
  transformed_doc = template.transform(doc)

  puts filename
end

api_name = Dir.glob('*xml').each {|filename| api_component(filename)}

puts api_name


def xml_to_csv(filename)
  dir = File.join(Dir.pwd,'/input/')
  xml_str = File.read(filename)
  doc     = Nokogiri::XML xml_str
  csv_filename = filename.gsub('.xml','.csv')
  record  = {} # hashes
  keys    = Set.new
  records = [] # array
  csv     = ""

# Visit every node in the document; the block receives each Nokogiri node in turn
# and collects element names and text values into the current record.
  doc.traverse do |node| 
    value = node.text.gsub(/\n +/, '')
      if node.name != "text" # skip text nodes; only process the other nodes
        if value.length > 0 # skip empty nodes
          key = node.name.gsub(/wd:/,'').to_sym
          #api_component = doc.xpath('/*/*[1]')
          # if a new and not empty record, add to our records collection
          if key == :Data_Request && !record.empty? # for regular XML parsing, use the request element, for example :Location_Data
            records << record
            record = {}
          elsif key[/^root$|^document$/]
            # neglect these keys
          else
            key = node.name.gsub(/data:/,'').to_sym
            # in case our value is html instead of text
            record[key] = Nokogiri::HTML.parse(value).text
            # add to our key set only if not already in the set
            keys << key
          end
        end
      end
    end



# build our csv
  dir = File.join(Dir.pwd,'/output/')
  File.open('../output/'+csv_filename, 'wb') do |file|
    file.puts %Q{"#{keys.to_a.join('","')}"}
    records.each do |record|
      keys.each do |key|
        file.write %Q{"#{record[key]}",}
      end
      file.write "\n"
    end
    print ''
    print filename+ " is ready!\n"
    print ''
  end
end

Dir.glob('*.xml').each { |filename| xml_to_csv(filename) }

As you can see, the node element is currently hard-coded: if key == :Data_Request && !record.empty?

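Can this be done with Nokogiri? And can it detect the element dynamically for every XML file in the path being read? If not, how can it be achieved with XSLT embedded in the script? A side question: is there a way for my script to write all the data as text so that leading zeros are preserved? :)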
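On the side question, a possible approach is to let Ruby's standard `csv` library write the rows with every field quoted. The sketch below uses a hypothetical `records_to_csv` helper and made-up sample rows. CSV itself has no column types, so the zeros are kept in the file, but a spreadsheet program may still strip them on import unless the column is imported as text.

```ruby
require 'csv'

# Hypothetical helper: builds CSV text with a header row, quoting every
# field and converting every value to a String first.
def records_to_csv(keys, records)
  CSV.generate(force_quotes: true) do |csv|
    csv << keys.map(&:to_s)
    records.each do |record|
      csv << keys.map { |key| record[key].to_s }
    end
  end
end

keys = [:name, :phone]
records = [{ name: 'John Doe', phone: '0123456776' }]

puts records_to_csv(keys, records)
```

Compared with building each line by hand, the library also escapes embedded quotes and does not leave a trailing comma at the end of each row.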

0 answers:

No answers yet