I'm trying to iterate over a folder structure stored as XML using Nokogiri, but I'm stuck:
<test>
  <folder name="Folder A">
    <folder name="Folder A1">
      <file name="a.txt">Cool file</file>
    </folder>
    <folder name="Folder A2"></folder>
  </folder>
  <folder name="Folder B">
    <folder name="Folder B1"></folder>
    <folder name="Folder B2">
      <folder name="Folder B21">
        <file name="b.txt"></file>
      </folder>
    </folder>
  </folder>
</test>
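I want to iterate over this so that I can create the tree of folders and files (folders A1 and A2 inside Folder A, folders B1 and B2 inside Folder B, and Folder B21 inside Folder B2).
So I do this: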
nodes = allnodes.xpath('//folder')
nodes.each do |node|
  puts "name => #{node.attributes['name']}"
end
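But this lists all of my folders (A, A1, A2, B, B1, B2, B21). How can I avoid descending into the nested folders at each step, so that I only get the immediate subfolders and can then pass each one to the same recursive function?
Thanks a lot for your help :)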
Answer 0 (score: 7)
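When you use an XPath expression like //foo, you find foo elements at any level of the document. If you use ./foo or just foo instead, you only find direct children of the current node. So: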
# Given an XML node, yields the node and all <file> children
# Then recursively does the same with every <folder> child
def process_files_and_folders(node, &blk)
  yield node, node.xpath('file')
  node.xpath('folder').each { |folder| process_files_and_folders(folder, &blk) }
end
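The keys to this are (a) recursion (the method calls itself for every child folder) and (b) capturing the block the caller passed using the &blk notation, then passing that same block along to the later calls.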
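The block-forwarding part can be tried without any XML. The following is a minimal sketch, assuming a hypothetical tree of nested Hashes; TREE, walk, and the folder names are made up for illustration and are not part of the method above:

```ruby
# Hypothetical stand-in for an XML tree: each folder is a Hash with a
# :name and a list of :children (all names invented for illustration).
TREE = {
  name: "root",
  children: [
    { name: "A", children: [{ name: "A1", children: [] }] },
    { name: "B", children: [] }
  ]
}

# Yields the node, then recurses into each child, forwarding the
# caller's block with &blk so every level yields to the same block.
def walk(node, &blk)
  yield node
  node[:children].each { |child| walk(child, &blk) }
end

visited = []
walk(TREE) { |node| visited << node[:name] }
p visited
```

Because the same block is handed down at every level, the caller sees the nodes parent first, in depth-first order, without knowing anything about the recursion.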
Seen in action with the XML from the question:
require 'nokogiri'
doc = Nokogiri.XML(my_xml)
process_files_and_folders( doc.root ) do |folder,files|
  depth = folder.ancestors.length-1 # Just for pretty text output indenting
  indent = " "*depth # Just for pretty text output indenting
  if folder['name']
    puts "#{indent}Processing the folder named #{folder['name']}"
  else
    puts "#{indent}No folder name; probably the root element."
  end
  unless files.empty?
    puts "#{indent}There are #{files.length} files in '#{folder['name']}':"
    files.each{ |file| print indent, file['name'], "\n" }
  end
end
Result:
No folder name; probably the root element.
 Processing the folder named Folder A
  Processing the folder named Folder A1
  There are 1 files in 'Folder A1':
  a.txt
  Processing the folder named Folder A2
 Processing the folder named Folder B
  Processing the folder named Folder B1
  Processing the folder named Folder B2
   Processing the folder named Folder B21
   There are 1 files in 'Folder B21':
   b.txt
Answer 1 (score: 2)
I would do it this way:
require 'nokogiri'
doc = Nokogiri::XML(<<-xml)
<test>
  <folder name="Folder A">
    <folder name="Folder A1">
      <file name="a.txt">Cool file</file>
    </folder>
    <folder name="Folder A2"></folder>
  </folder>
  <folder name="Folder B">
    <folder name="Folder B1"></folder>
    <folder name="Folder B2">
      <folder name="Folder B21">
        <file name="b.txt"></file>
      </folder>
    </folder>
  </folder>
</test>
xml
# Collect every folder that has at least one child folder.
parent_folders = doc.xpath("//folder").select do |folder_node|
  folder_node.xpath("./folder").size > 0
end
# Iterate over each parent folder and collect the names of its
# direct subfolders.
parent_directory = parent_folders.each_with_object({}) do |parent_dir, dir_hash|
  dir_hash[parent_dir['name']] = parent_dir.xpath("./folder").collect do |sub_dir|
    sub_dir['name']
  end
end
parent_directory
# => {"Folder A"=>["Folder A1", "Folder A2"],
# "Folder B"=>["Folder B1", "Folder B2", "Folder B21"],
# "Folder B2"=>["Folder B21"]}
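Now you have a hash, parent_directory, that holds every directory (key) to subdirectories (value) relationship. Using the Hash#[] method, you can easily look up the subdirectories of a given directory. For example: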
parent_directory['Folder A'] # => ["Folder A1", "Folder A2"]
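One thing to watch for: the select step above keeps only folders that have at least one subfolder, so a leaf folder is never a key and Hash#[] returns nil for it. A minimal sketch of handling that with Hash#fetch, using a hand-written excerpt of the hash (the literal below is an assumption standing in for the real result):

```ruby
# Hand-written excerpt with the same shape as parent_directory above:
# only folders that have at least one subfolder appear as keys.
parent_directory = {
  "Folder A"  => ["Folder A1", "Folder A2"],
  "Folder B2" => ["Folder B21"]
}

# "Folder A1" has no subfolders, so it was never added as a key.
missing  = parent_directory["Folder A1"]            # plain lookup
children = parent_directory.fetch("Folder A1", [])  # lookup with a default

# With the default in place, callers can iterate without a nil check.
children.each { |name| puts name }
p missing, children
```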
Answer 2 (score: 0)
It's a bit unclear what you are trying to do, but assuming you are creating a new directory structure on disk on a Linux system:
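doc.xpath("//folder[not(folder)]").each do |f|
  path = f.xpath("ancestor-or-self::folder").map { |a| a['name'] }.join("/")
  # Pass the arguments separately so that spaces in folder names
  # (e.g. "Folder A") are not split by the shell.
  system("mkdir", "-p", path)
end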
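Here is what it does: it selects every folder that has no child folder (the leaves of the tree), builds each leaf's path by joining the names of the leaf and its folder ancestors with "/", and runs mkdir -p on that path, which also creates any missing parent directories.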
Answer 3 (score: 0)
So, I later figured out how to solve it.
To clarify, I was aiming for a function like this:
def create_structure(nodeset, current_folder)
  new_folder = "#{current_folder}/#{nodeset['name']}"
  Dir.mkdir(new_folder)
  create_files_in_current_folder(nodeset, new_folder)
  # Relative path: only the direct <folder> children of this node
  subnodeset = nodeset.xpath('folder')
  subnodeset.each do |node|
    create_structure(node, new_folder)
  end
end
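This way I can copy the structure in the XML to the file system.
As for the solution, it was right in front of my eyes. I should not use '//folder' but a relative path such as 'folder' (or './folder'): the first returns every folder regardless of its position in the XML structure, whereas the relative form returns only the folders directly under the current node. Note that a path with a single leading slash, '/folder', is absolute and is evaluated from the document root, so it does not select the current node's children.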
I hope this helps, and thanks to everyone for the answers. I will try them as soon as I can.