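I have a program that simulates typing. It takes the file's location from user input, along with the file name and extension. It then iterates over the file, breaks it into individual characters, and puts them into an array.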
def file_to_array(file)
empty = []
File.foreach("#{file}") do |line|
empty << line.to_s.split('')
end
return empty.flatten!
end
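When the program runs, it sends the keys to a text area through win32ole to simulate typing.
After 5,000 characters the memory overhead becomes too large and the program starts to slow down. The further past 5,000 characters it gets, the slower it runs. Is there a way to optimize this?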
- EDIT -
require 'benchmark'
def file_to_array(file)
empty = []
File.foreach(file) do |line|
empty << line.to_s.split('')
end
return empty.flatten!
end
def file_to_array_2(file)
File.read(file).split('')
end
file = 'xxx'
Benchmark.bm do |results|
results.report { print file_to_array(file) }
results.report { print file_to_array_2(file) }
end
user system total real
0.234000 0.000000 0.234000 ( 0.787020)
0.218000 0.000000 0.218000 ( 1.917185)
Answer 0 (score: 2)
I ran my own benchmark and profile. Here is the code:
#!/usr/bin/env ruby
require 'benchmark'
require 'rubygems'
require 'ruby-prof'
def ftoa_1(path)
empty = []
File.foreach(path) do |line|
empty << line.to_s.split('')
end
return empty.flatten!
end
def ftoa_2(path)
File.read(path).split('')
end
def ftoa_3(path)
File.read(path).chars
end
def ftoa_4(path)
File.open(path) { |f| f.each_char.to_a }
end
GC.start
GC.disable
Benchmark.bm(6) do |x|
1.upto(4) do |n|
x.report("ftoa_#{n}") {send("ftoa_#{n}", ARGV[0])}
end
end
1.upto(4) do |n|
puts "\nProfiling ftoa_#{n} ...\n"
result = RubyProf.profile do
send("ftoa_#{n}", ARGV[0])
end
RubyProf::FlatPrinter.new(result).print($stdout)
end
Here are my results:
user system total real
ftoa_1 2.090000 0.160000 2.250000 ( 2.250350)
ftoa_2 1.540000 0.090000 1.630000 ( 1.632173)
ftoa_3 0.420000 0.080000 0.500000 ( 0.505286)
ftoa_4 0.550000 0.090000 0.640000 ( 0.630003)
Profiling ftoa_1 ...
Measure Mode: wall_time
Thread ID: 70190654290440
Fiber ID: 70189795562220
Total: 2.571306
Sort by: self_time
%self total self wait child calls name
83.39 2.144 2.144 0.000 0.000 103930 String#split
12.52 0.322 0.322 0.000 0.000 1 Array#flatten!
3.52 2.249 0.090 0.000 2.159 1 <Class::IO>#foreach
0.57 0.015 0.015 0.000 0.000 103930 String#to_s
0.00 2.571 0.000 0.000 2.571 1 Global#[No method]
0.00 2.571 0.000 0.000 2.571 1 Object#ftoa_1
0.00 0.000 0.000 0.000 0.000 1 Fixnum#to_s
* indicates recursively called methods
Profiling ftoa_2 ...
Measure Mode: wall_time
Thread ID: 70190654290440
Fiber ID: 70189795562220
Total: 1.855242
Sort by: self_time
%self total self wait child calls name
99.77 1.851 1.851 0.000 0.000 1 String#split
0.23 0.004 0.004 0.000 0.000 1 <Class::IO>#read
0.00 1.855 0.000 0.000 1.855 1 Global#[No method]
0.00 1.855 0.000 0.000 1.855 1 Object#ftoa_2
0.00 0.000 0.000 0.000 0.000 1 Fixnum#to_s
* indicates recursively called methods
Profiling ftoa_3 ...
Measure Mode: wall_time
Thread ID: 70190654290440
Fiber ID: 70189795562220
Total: 0.721246
Sort by: self_time
%self total self wait child calls name
99.42 0.717 0.717 0.000 0.000 1 String#chars
0.58 0.004 0.004 0.000 0.000 1 <Class::IO>#read
0.00 0.721 0.000 0.000 0.721 1 Object#ftoa_3
0.00 0.721 0.000 0.000 0.721 1 Global#[No method]
0.00 0.000 0.000 0.000 0.000 1 Fixnum#to_s
* indicates recursively called methods
Profiling ftoa_4 ...
Measure Mode: wall_time
Thread ID: 70190654290440
Fiber ID: 70189795562220
Total: 0.816140
Sort by: self_time
%self total self wait child calls name
99.99 0.816 0.816 0.000 0.000 2 IO#each_char
0.00 0.000 0.000 0.000 0.000 1 File#initialize
0.00 0.000 0.000 0.000 0.000 1 IO#close
0.00 0.816 0.000 0.000 0.816 1 <Class::IO>#open
0.00 0.000 0.000 0.000 0.000 1 IO#closed?
0.00 0.816 0.000 0.000 0.816 1 Global#[No method]
0.00 0.816 0.000 0.000 0.816 1 Enumerable#to_a
0.00 0.816 0.000 0.000 0.816 1 Enumerator#each
0.00 0.816 0.000 0.000 0.816 1 Object#ftoa_4
0.00 0.000 0.000 0.000 0.000 1 Fixnum#to_s
* indicates recursively called methods
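The conclusion is that ftoa_3 is the fastest when GC is turned off, but I would recommend ftoa_4 because it uses less memory and therefore triggers fewer GC runs. If you turn GC on, you will find that ftoa_4 is the fastest.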
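If the characters are only consumed one at a time, as when sending keystrokes, the full array may not be needed at all. The following is a minimal sketch along the lines of ftoa_4, not the original program's code: each_typed_char is a hypothetical helper name, and the block passed to it stands in for whatever win32ole call actually sends the key.

```ruby
# Hypothetical helper (not from the original program): yields each
# character of the file to the caller instead of collecting every
# character into one large array first.
def each_typed_char(path)
  # Without a block, return an Enumerator so callers can still chain
  # methods such as each_slice or first.
  return enum_for(:each_typed_char, path) unless block_given?

  File.open(path) do |f|
    f.each_char { |c| yield c }
  end
end
```

A call such as each_typed_char(path) { |c| print c } would then process the file character by character. Whether this helps in the typing program depends on where the slowdown really comes from, which only profiling the full program can show.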
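From the profile results, you can see that the program spends most of its time in String#split in both ftoa_1 and ftoa_2. ftoa_1 is the worst because String#split runs many times (once per line), and Array#flatten! also takes a long time.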
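To see that the per-line split and the flatten pass are extra work rather than extra functionality, the two approaches can be compared on a small input. This is a sketch under assumptions: chars_per_line and chars_whole_file are hypothetical names mirroring ftoa_1 and ftoa_3, and the sample text is made up for the comparison.

```ruby
require 'tempfile'

# Per-line approach (as in ftoa_1): one String#split call per line,
# followed by a flatten pass over the nested array.
def chars_per_line(path)
  nested = []
  File.foreach(path) { |line| nested << line.split('') }
  nested.flatten
end

# Whole-file approach (as in ftoa_3): a single String#chars call.
def chars_whole_file(path)
  File.read(path).chars
end

# Hypothetical sample data, written to a temporary file for the check.
Tempfile.create('sample') do |f|
  f.write("first line\nsecond line\n")
  f.flush
  # Both approaches produce the same array of characters.
  puts chars_per_line(f.path) == chars_whole_file(f.path)
end
```

Both methods return the same characters, so the difference between them is only in how many intermediate arrays and method calls are needed to get there.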