Question

I am reading pairs of files from a (Linux) file system, such as ... UniqueDocument.xml UniqueDocument.pdf

I need to find the entries that have no xml file, and then retrieve them.

I have been trying os.list and regex, and Dir() in Ruby, but I have not found a working solution yet. I can't get it finished... my mind is stuck.

Answer 1

In Ruby:

# Get an array of file names for pdf and xml
pdf=Dir.glob("test/*.pdf").map {|f| File.basename(f, '.pdf')}
xml=Dir.glob("test/*.xml").map {|f| File.basename(f, '.xml')}

# Subtract the xml names from the pdf names to get the names that have a pdf file but no xml
p pdf - xml

How does it work?

Dir.glob("test/*.pdf")

Returns an array containing the paths of all pdf files in the folder test. It looks like ["test/foo.pdf", ...].

File.basename('test/foo.pdf', '.pdf')

Returns the file name without the directory and without the extension. In this case it returns 'foo'.
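To make the suffix behaviour concrete, here is a small sketch. The helper name `stem` is hypothetical and not part of the original answer; it uses the `'.*'` form of the second argument, which strips whatever the last extension is.

```ruby
# Hypothetical helper (not from the answer): file name without its directory
# and without its last extension, whatever that extension is.
def stem(path)
  File.basename(path, '.*')
end

p stem('test/foo.pdf')                    # directory and ".pdf" removed
p stem('test/foo.xml')                    # same stem as the pdf above
p File.basename('test/foo.xml', '.pdf')   # suffix does not match, so ".xml" is kept
```

With an explicit suffix such as '.pdf', the extension is only removed when it matches, which is why the answer passes '.pdf' and '.xml' separately for each list.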

Dir.glob("test/*.pdf").map {|f| File.basename(f, '.pdf')}

Returns an array of file names without the extension, for the pdf files only.

pdf - xml

Returns all strings that are in pdf but not in xml.
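The steps above can be tried end to end with a throwaway directory. This is only a sketch: the helper name `pdfs_without_xml` and the sample file names are made up for illustration, and the result is sorted so the output does not depend on the order in which the file system lists entries.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: base names that have a .pdf file but no .xml file in dir.
def pdfs_without_xml(dir)
  pdf = Dir.glob(File.join(dir, '*.pdf')).map { |f| File.basename(f, '.pdf') }
  xml = Dir.glob(File.join(dir, '*.xml')).map { |f| File.basename(f, '.xml') }
  (pdf - xml).sort
end

Dir.mktmpdir do |dir|
  # Sample data (assumed): a and c are complete pairs, b has only a pdf.
  %w[a.pdf a.xml b.pdf c.pdf c.xml].each do |name|
    FileUtils.touch(File.join(dir, name))
  end
  p pdfs_without_xml(dir)   # prints ["b"]
end
```

Swapping the operands (xml - pdf) gives the opposite case, xml files without a pdf. Note that on Linux the patterns are case-sensitive, so a file named foo.PDF would not be matched by '*.pdf'.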
