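I am reading a log file and trying to organize the data in the format below, so I want to use NAME (i.e. USOLA51, USOLA10, ...) as the hash key and create a corresponding array for LIST and DETAILS. I have created the hash too, but I am not sure how to get/extract the corresponding/associated array values.
Expected output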
NAME    LIST     DETAILS
USOLA51 ICC_ONUS .035400391
        PA_ONUS  .039800391
        PE_ONUS  .000610352
USOLA10 PAL      52.7266846
        CFG_ONUS 15.9489746
likewise for the other values
Log file:
--- data details ----
USOLA51
ONUS size
------------------------------ ----------
ICC_ONUS .035400391
PA_ONUS .039800391
PE_ONUS .000610352
=========================================
---- data details ----
USOLA10
ONUS size
------------------------------ ----------
PAL 52.7266846
CFG_ONUS 15.9489746
=========================================
---- data details ----
USOLA55
ONUS size
------------------------------ ----------
PA_ONUS 47.4707031
PAL 3.956604
ICC_ONUS .020385742
PE_ONUS .000610352
=========================================
---- data details ----
USOLA56
ONUS size
------------------------------ ----------
=========================================
What I have tried
unique = Array.new
owner = Array.new
db = Array.new
File.read("mydb_size.log").each_line do |line|
next if line =~ /---- data details ----|^ONUS|---|=======/
unique << line.strip if line =~ /^U.*\d/
end
hash = Hash[unique.collect { |item| [item, ""] } ]
puts hash
Current output
{"USOLA51"=>"", "USOLA10"=>"", "USOLA55"=>"", "USOLA56"=>""}
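Any help to move forward here would be really appreciated. Thanks!
Answer 0 (score: 2)
Although your log file isn't a CSV, I find the csv library useful in a lot of non-CSV parsing. You can use it to parse your log file by skipping blank lines and any line that starts with ---, === or ONUS. Your column separator is a space character: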
require 'csv'
csv = CSV.read("./example.log", skip_lines: /\A(---|===|ONUS)/,
               skip_blanks: true, col_sep: " ")
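To see what the parsed rows look like, here is a minimal sketch that passes the same options to CSV.parse on an inline string instead of a file. The `sample` text is a shortened, hypothetical copy of the log, and it assumes single spaces between columns and no trailing whitespace.

```ruby
require 'csv'

# Shortened, hypothetical copy of the log from the question.
sample = <<~LOG
  ---- data details ----
  USOLA51
  ONUS size
  ------------------------------ ----------
  ICC_ONUS .035400391
  PA_ONUS .039800391
  =========================================
  ---- data details ----
  USOLA56
  ONUS size
  ------------------------------ ----------
  =========================================
LOG

# Same options as the CSV.read call above, applied to a string.
rows = CSV.parse(sample, skip_lines: /\A(---|===|ONUS)/,
                         skip_blanks: true, col_sep: " ")

# Name rows come back as one-element arrays, data rows as pairs.
rows.each { |row| p row }
```

If the real file pads its columns with several spaces, col_sep: " " would likely produce extra empty fields, so this approach depends on single-space separation.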
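Then, some rows parse into an array with only 1 element; those are your header rows. So we can split the csv array into groups based on when a row has only 1 element, and build a hash from the result: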
output_hash = csv.slice_before { |row| row.length == 1 }.
each_with_object({}) do |((name), *rows), hash|
hash[name] = rows.to_h
end
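Since the question asks how to get at the values associated with each name, here is a small sketch of reading from a hash of this shape. The `output_hash` literal is a hand-written subset of the expected result, not something parsed from the file.

```ruby
# Hand-written subset of the hash the code above is expected to build.
output_hash = {
  "USOLA51" => { "ICC_ONUS" => ".035400391", "PA_ONUS" => ".039800391" },
  "USOLA56" => {}
}

lists   = output_hash["USOLA51"].keys            # LIST column for one name
details = output_hash["USOLA51"].values          # DETAILS column for one name
size    = output_hash.dig("USOLA51", "PA_ONUS")  # one specific value

p lists    # => ["ICC_ONUS", "PA_ONUS"]
p details  # => [".035400391", ".039800391"]
p size     # => ".039800391"

# A name with no data rows maps to an empty hash.
p output_hash["USOLA56"].empty?  # => true
```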
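Now, it's hard to tell whether you want the hash output as the text you showed, or just want the hash. If you want the text output, we first need to see how much space each column needs: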
name_length = output_hash.keys.max_by(&:length).length
list_length = output_hash.values.flat_map(&:keys).max_by(&:length).length
detail_length = output_hash.values.flat_map(&:values).max_by(&:length).length
format = "%-#{name_length}s %-#{list_length}s %-#{detail_length}s"
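As a quick illustration of what the `%-Ns` directives do, here is a sketch with hard-coded widths. The widths 7, 8 and 10 are what the calculation above should give for the sample log, but treat them as an assumption; `row_format` is a name chosen for this example only.

```ruby
# "%-7s" left-justifies a string and pads it with spaces to 7 characters.
row_format = "%-7s %-8s %-10s"

first_line = row_format % ["USOLA10", "PAL", "52.7266846"]
puts first_line

# An empty first column keeps continuation rows aligned under LIST.
next_line = row_format % ["", "CFG_ONUS", "15.9489746"]
puts next_line
```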
Then we can output the header row and all the values in output_hash, but only for names that have any values:
puts("#{format}\n\n" % ["NAME", "LIST", "DETAILS"])
output_hash.reject { |name, values| values.empty? }.each do |name, values|
list, detail = values.first
puts(format % [name, list, detail])
values.drop(1).each do |list, detail|
puts(format % ['', list, detail])
end
puts
end
And the result:
NAME    LIST     DETAILS

USOLA51 ICC_ONUS .035400391
        PA_ONUS  .039800391
        PE_ONUS  .000610352

USOLA10 PAL      52.7266846
        CFG_ONUS 15.9489746

USOLA55 PA_ONUS  47.4707031
        PAL      3.956604
        ICC_ONUS .020385742
        PE_ONUS  .000610352
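It's hard (for me) to explain what slice_before does. But it takes an array (or another enumerable) and creates groups or chunks of its elements, where the first element of each group matches the argument, or the block returns true for it. For example, if we have a smaller array: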
array = ["slice here", 1, 2, "slice here", 3, 4]
array.slice_before { |el| el == "slice here" }.entries
# => [["slice here", 1, 2], ["slice here", 3, 4]]
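We told slice_before that we want each group to start with the element equal to "slice here", so we get back 2 groups. The first element of each group is "slice here", and the remaining elements are the ones that follow it in the array until the next "slice here".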
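slice_before also accepts a pattern instead of a block, in which case each element is compared against it with `===`. A minimal sketch, using made-up strings rather than the parsed CSV rows:

```ruby
# Illustrative flat list; the real data above is an array of arrays.
lines = ["USOLA51", "ICC_ONUS", "PA_ONUS", "USOLA10", "PAL"]

# Start a new group at every element that matches the regexp.
groups = lines.slice_before(/\AUSOLA/).to_a

p groups
# => [["USOLA51", "ICC_ONUS", "PA_ONUS"], ["USOLA10", "PAL"]]
```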
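So we can take that result and call each_with_object on it, passing an empty hash to start with. With each_with_object, the first block parameter is the element of the array (just as with each), and the second is the object you passed in. When the block parameters look like |((name), *rows), hash|, the first parameter (the element of the array) is destructured into the first element of that array and the rest of its elements: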
# the array here is what gets passed to `each_with_object` for the first iteration as the first parameter
name, *rows = [["USOLA51"], ["ICC_ONUS", ".035400391"], ["PA_ONUS", ".039800391"], ["PE_ONUS", ".000610352"]]
name # => ["USOLA51"]
rows # => [["ICC_ONUS", ".035400391"], ["PA_ONUS", ".039800391"], ["PE_ONUS", ".000610352"]]
Then we destructure that first element again, just so that we don't have it wrapped in an array:
name, * = name # the `, *` isn't needed in the block parameters, but is needed when you run these examples in irb
name # => "USOLA51"
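Put together, the nested block parameters perform both destructuring steps at once. A small sketch with hand-written groups (the data is illustrative, in the shape slice_before is expected to return):

```ruby
# Each group is a one-element name row followed by zero or more data rows.
groups = [
  [["USOLA51"], ["ICC_ONUS", ".035400391"], ["PA_ONUS", ".039800391"]],
  [["USOLA56"]]
]

result = groups.each_with_object({}) do |((name), *rows), hash|
  # name is already unwrapped from its one-element array here,
  # and rows holds the remaining [key, value] pairs.
  hash[name] = rows.to_h
end

p result.keys                   # => ["USOLA51", "USOLA56"]
p result["USOLA51"]["PA_ONUS"]  # => ".039800391"
p result["USOLA56"].empty?      # => true
```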
For max_by(&:length).length, all we're doing is finding the longest element in the array (returned by keys or values) and getting its length:
output_hash = {"USOLA51"=>{"ICC_ONUS"=>".035400391", "PA_ONUS"=>".039800391", "PE_ONUS"=>".000610352"}, "USOLA10"=>{"PAL"=>"52.7266846", "CFG_ONUS"=>"15.9489746"}, "USOLA55"=>{"PA_ONUS"=>"47.4707031", "PAL"=>"3.956604", "ICC_ONUS"=>".020385742", "PE_ONUS"=>".000610352"}, "USOLA56"=>{}}
output_hash.values.flat_map(&:keys)
# => ["ICC_ONUS", "PA_ONUS", "PE_ONUS", "PAL", "CFG_ONUS", "PA_ONUS", "PAL", "ICC_ONUS", "PE_ONUS"]
output_hash.values.flat_map(&:keys).map(&:length) # => [8, 7, 7, 3, 8, 7, 3, 8, 7]
output_hash.values.flat_map(&:keys).max_by(&:length) # => "ICC_ONUS"
output_hash.values.flat_map(&:keys).max_by(&:length).length # => 8
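An equivalent way to get the same number is to map to lengths first and then take the maximum. A sketch with a shortened list of keys:

```ruby
# Shortened, hand-written list of keys.
keys = ["ICC_ONUS", "PA_ONUS", "PE_ONUS", "PAL", "CFG_ONUS"]

# Longest element first, then its length.
longest = keys.max_by(&:length)
width_a = longest.length

# Lengths first, then the largest one.
width_b = keys.map(&:length).max

p width_a  # => 8
p width_b  # => 8
```

Both forms give the column width; the second skips the intermediate string when only the number is needed.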
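Answer 1 (score: 1)
It's been a long time since I last used Ruby, so I have probably forgotten a lot of the shortcuts and syntactic sugar, but this file seems easy to parse without much effort.
A simple line-by-line comparison against the expected values is enough. The first step is to strip all surrounding whitespace and ignore blank lines or lines starting with = or -. Next, if a line has only one value, it is the title, and the line after it consists of the column names, which can be ignored for the desired output. Whenever a title or the column names are encountered, move on to the next line; every other line is saved as a Ruby key/value pair. While doing this, also track the longest string seen in each column and adjust that column's padding, so the padding can be used later to produce table-like output.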
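# Read the whole log into a string (file name taken from the question)
str = File.read("mydb_size.log")
# Set up the loop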
merged = []
current = -1
awaiting_headers = false
columns = ['NAME', 'LIST', 'DETAILS']
# Keep track of the max column length
columns_pad = columns.map { |c| c.length }
str.each_line do |line|
# Remove surrounding whitespaces,
# ignore empty or = - lines
line.strip!
next if line.empty?
next if ['-','='].include? line[0]
# Get the values of this line
parts = line.split ' '
# We're not awaiting the headers and
# there is just one value, must be the title
if not awaiting_headers and parts.size == 1
# If this string is longer than the current maximum
columns_pad[0] = line.length if line.length > columns_pad[0]
# Create a hash for this item
merged[current += 1] = {name: line, data: {}}
# Next must be the headers
awaiting_headers = true
next
end
# Headers encountered
if awaiting_headers
# Just skip it from here
awaiting_headers = false
next
end
# Take 2 parts of each (should be always only those two)
# and treat them as key/value
parts.each_cons(2) do |key, value|
# Make it a ruby key/value pair
merged[current][:data][key] = value
# Check if LIST or DETAILS column length needs to be raised
columns_pad[1] = key.length if key.length > columns_pad[1]
columns_pad[2] = value.length if value.length > columns_pad[2]
end
end
# Adding three spaces between columns
columns_pad.map! { |c| c + 3}
# Writing the headers
result = columns.map.with_index { |c, i| c.ljust(columns_pad[i]) }.join + "\n"
merged.each do |item|
# Remove the next line if you want to include empty data
next if item[:data].empty?
result += "\n"
result += item[:name].ljust(columns_pad[0])
# For the first value in data, we don't need extra padding or a line break
padding = ""
item[:data].each do |key, value|
result += padding
result += key.ljust(columns_pad[1])
result += value.ljust(columns_pad[2])
# Set the padding to include a line break and fill up the NAME column with spaces
padding = "\n" + "".ljust(columns_pad[0])
end
result += "\n"
end
puts result
This results in
NAME      LIST       DETAILS

USOLA51   ICC_ONUS   .035400391
          PA_ONUS    .039800391
          PE_ONUS    .000610352

USOLA10   PAL        52.7266846
          CFG_ONUS   15.9489746

USOLA55   PA_ONUS    47.4707031
          PAL        3.956604
          ICC_ONUS   .020385742
          PE_ONUS    .000610352
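The merged array can also be queried directly instead of printed. A brief sketch, assuming a hand-written sample in the same shape as the array built above (the variable names `item` and `empty_names` are just for this example):

```ruby
# Hand-written sample in the same shape as the `merged` array above.
merged = [
  { name: "USOLA51", data: { "ICC_ONUS" => ".035400391" } },
  { name: "USOLA10", data: { "PAL" => "52.7266846", "CFG_ONUS" => "15.9489746" } },
  { name: "USOLA56", data: {} }
]

# Look up one entry by name and read its LIST / DETAILS values.
item = merged.find { |entry| entry[:name] == "USOLA10" }
p item[:data].keys         # => ["PAL", "CFG_ONUS"]
p item[:data]["PAL"].to_f  # => 52.7266846

# Names that have no data rows at all.
empty_names = merged.select { |entry| entry[:data].empty? }
                    .map { |entry| entry[:name] }
p empty_names              # => ["USOLA56"]
```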