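Reading two files in Ruby and outputting the result

Asked: 2015-11-10 22:38:30

Tags: ruby

I have a name.txt file and a last.txt file. I want to generate every possible combination of first name and last name. For example: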

$ cat name.txt
Jack
Jamie
James
Jarred
Josh
John
Jane 


$ cat last.txt
doe
smith

I tried this:

File.open("name.txt", "r") do |n|
  File.open("last.txt", "r") do |l|
    n.each_line do |first|
      l.each_line do |last|
        full_name = first.chomp + " " + last.chomp
        puts full_name
      end
    end
  end
end

The output shows that it only processes the first line of the names file:

Jack doe 
Jack smith

How can I make it go through the whole first file and produce full names for every name in name.txt?

2 answers:

Answer 0 (score: 3)

Consider this:

first = %w[jane john]
last = %w[doe smith]

first.product(last)
# => [["jane", "doe"], ["jane", "smith"], ["john", "doe"], ["john", "smith"]]

You could do this:

first = File.readlines('name.txt').map(&:rstrip)
last = File.readlines('last.txt').map(&:rstrip)
first.product(last)
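
To get from those pairs to the printed full names the question asks for, each pair can be joined with a space. A minimal sketch, assuming a helper called `full_names` (a hypothetical name, not part of the original answer) and two small sample files written to a temporary directory so the snippet runs on its own:

```ruby
require 'tmpdir'

# Hypothetical helper: returns every "first last" combination as a string.
def full_names(first_path, last_path)
  first = File.readlines(first_path).map(&:rstrip)
  last  = File.readlines(last_path).map(&:rstrip)
  first.product(last).map { |pair| pair.join(' ') }
end

Dir.mktmpdir do |dir|
  name_file = File.join(dir, 'name.txt')
  last_file = File.join(dir, 'last.txt')
  File.write(name_file, "Jack\nJamie\n")
  File.write(last_file, "doe\nsmith\n")

  puts full_names(name_file, last_file)
end
# Prints:
# Jack doe
# Jack smith
# Jamie doe
# Jamie smith
```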

product is one of Array's methods. Also take a look at permutation and combination.
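
For comparison, here is a small sketch of those two methods. Unlike product, both draw their elements from a single array, so they would not pair first names with last names directly. The `letters` array is only sample data:

```ruby
letters = %w[a b c]

# Ordered pairs drawn from one array; an element is never paired with itself.
perms = letters.permutation(2).to_a
# => [["a", "b"], ["a", "c"], ["b", "a"], ["b", "c"], ["c", "a"], ["c", "b"]]

# Unordered pairs drawn from one array.
combos = letters.combination(2).to_a
# => [["a", "b"], ["a", "c"], ["b", "c"]]
```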

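We could use chomp instead of rstrip to remove the trailing newline that readlines returns, but chomp only trims the line ending, whereas rstrip also removes any trailing whitespace, which cleans up the names a little. (In my experience we're more likely to see whitespace after the text than before it, because leading whitespace is easier to spot.)

Benchmarks: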
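
The difference is easy to see on a single line. A short sketch, using a made-up sample string with a trailing space before the newline:

```ruby
line = "Jane \n" # a name followed by a trailing space and a newline

chomped  = line.chomp  # => "Jane "  (newline removed, trailing space kept)
stripped = line.rstrip # => "Jane"   (newline and trailing space removed)
```

On Ruby 2.4 or later, `File.readlines('name.txt', chomp: true)` should give the same result as calling chomp on every line, though it still leaves trailing spaces in place.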

require 'fruity'

FIRST_NAME = [*'a'..'z']
LAST_NAME  = [*'a'..'z']

FIRST_NAME.size # => 26
LAST_NAME.size  # => 26

def use_product
  FIRST_NAME.product(LAST_NAME) 
end

def use_loops
  output = []
  FIRST_NAME.each do |fn|
    LAST_NAME.each do |ln|
      output << [fn, ln]
    end
  end
  output
end

result = use_product
result.size  # => 676
result.first # => ["a", "a"]
result.last  # => ["z", "z"]

result = use_loops
result.size  # => 676
result.first # => ["a", "a"]
result.last  # => ["z", "z"]

Running it results in:

compare :use_product, :use_loops
# >> Running each test 64 times. Test will take about 1 second.
# >> use_product is faster than use_loops by 50.0% ± 10.0%

If the size of the source arrays increases:

require 'fruity'

FIRST_NAME = [*'a1'..'z9']
LAST_NAME  = [*'a1'..'z9']

FIRST_NAME.size # => 259
LAST_NAME.size  # => 259

def use_product
  FIRST_NAME.product(LAST_NAME) 
end

def use_loops
  output = []
  FIRST_NAME.each do |fn|
    LAST_NAME.each do |ln|
      output << [fn, ln]
    end
  end
  output
end

result = use_product
result.size  # => 67081
result.first # => ["a1", "a1"]
result.last  # => ["z9", "z9"]

result = use_loops
result.size  # => 67081
result.first # => ["a1", "a1"]
result.last  # => ["z9", "z9"]

Running that returns:

compare :use_product, :use_loops
# >> Running each test once. Test will take about 1 second.
# >> use_product is faster than use_loops by 60.00000000000001% ± 10.0%

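While we can write the algorithm without relying on the built-in methods, those methods are written in C, so taking advantage of them gives us more speed.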
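
If the fruity gem is not available, a similar comparison can be sketched with the Benchmark module from Ruby's standard library. The constant and method names below, and the repeat count, are assumptions made for this sketch. The timings will vary by machine, so no output is shown:

```ruby
require 'benchmark'

SAMPLE_FIRST = [*'a'..'z']
SAMPLE_LAST  = [*'a'..'z']
ITERATIONS   = 1_000 # arbitrary repeat count chosen for this sketch

# Built-in Cartesian product.
def product_pairs
  SAMPLE_FIRST.product(SAMPLE_LAST)
end

# Hand-written nested loops that build the same pairs.
def loop_pairs
  output = []
  SAMPLE_FIRST.each do |fn|
    SAMPLE_LAST.each do |ln|
      output << [fn, ln]
    end
  end
  output
end

Benchmark.bm(10) do |x|
  x.report('product:') { ITERATIONS.times { product_pairs } }
  x.report('loops:')   { ITERATIONS.times { loop_pairs } }
end
```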

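There is one situation where I'd iterate over the separate lists instead of using the built-in product: if I had two huge lists and pulling them into memory was prohibitive because of RAM constraints, causing scalability problems, then nested loops would be the only way to handle it. Ruby's foreach is extremely fast, so writing code around it would be a good alternative: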

File.foreach('name.txt') do |first|
  File.foreach('last.txt') do |last|
    full_name = first.chomp + " " + last.chomp
    puts full_name
  end
end
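
For reference, the code in the question stops after the first name because both loops share one open handle on last.txt: the inner each_line reads it to the end during the first pass, so later passes find nothing left to read. File.foreach avoids this by opening last.txt again on every pass. Another option, sketched here with a hypothetical helper name and sample files in a temporary directory, is to keep the question's structure and rewind the inner handle before each pass:

```ruby
require 'tmpdir'

# Hypothetical helper: the question's nested loops, plus a rewind.
def combine_with_rewind(first_path, last_path)
  results = []
  File.open(first_path, 'r') do |n|
    File.open(last_path, 'r') do |l|
      n.each_line do |first|
        l.rewind # go back to the start of last.txt for every first name
        l.each_line do |last|
          results << "#{first.chomp} #{last.chomp}"
        end
      end
    end
  end
  results
end

Dir.mktmpdir do |dir|
  name_file = File.join(dir, 'name.txt')
  last_file = File.join(dir, 'last.txt')
  File.write(name_file, "Jack\nJamie\n")
  File.write(last_file, "doe\nsmith\n")

  puts combine_with_rewind(name_file, last_file)
end
# Prints:
# Jack doe
# Jack smith
# Jamie doe
# Jamie smith
```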

Answer 1 (score: 2)

To get each line of a text file, you have to use each, like this:

File.open("name.txt", "r").each do |n|
 . . . 

end

So, using each, your code can work:

# Each line of name.txt is yielded as a String.
File.open("name.txt", "r").each do |first|
  # last.txt is reopened for every first name, so it is read from the start each time.
  File.open("last.txt", "r").each do |last|
    full_name = first.chomp + " " + last.chomp
    puts full_name
  end
end

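Although this solves your problem, it is not an efficient way to read files.

To be more efficient, you should use readlines to read a file's entire contents at once and keep them in an array. See this answer for details.

So your code can be more efficient if it is written this way: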

names = File.readlines('name.txt')
last_names = File.readlines('last.txt')

# Both files are already in memory as arrays of lines,
# so the inner loop starts over for every first name.
names.each do |first|
  last_names.each do |last|
    full_name = first.chomp + " " + last.chomp
    puts full_name
  end
end