How to write Ruby code that collects data while looping over conditions

Time: 2014-01-20 08:50:50

Tags: ruby regex

I am quite new to Ruby and I need your help. I want to write Ruby code that collects some data while looping. I have two scripts for this job.

My goal is to collect the total score from the text split out of each line of the input file.

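- First, run test_dialog.rb.
- Second, convert each line of the input file from this format:
  AA:0.88:320:800|BB:0.82:1040:1330|CC:0.77:1330:1700 enquire-privilege_card
  to this one:
  AA 0.88 BB 0.82 CC 0.77
- Then check each separated text against the dialog conditions. If the text appears in a dialog condition, store its score, and keep going until the end of the line (AA -> BB -> CC).
- Finally, get the average score.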
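The reformatting step in the list above could be sketched roughly as follows. This is only an illustration: the variable names are made up for the example, and it assumes the hypothesis and the reference tag are separated by whitespace (test_dialog.rb itself splits on a tab).

```ruby
# One line of the input file: "LABEL:score:start:end|..." followed by the reference tag.
line = "AA:0.88:320:800|BB:0.82:1040:1330|CC:0.77:1330:1700\tenquire-privilege_card"

# Separate the hypothesis from the reference tag (assumed to be whitespace-separated).
hyp, ref_tag = line.strip.split(/\s+/, 2)

# Keep only the label and the score of each token.
pairs = hyp.split('|').map do |token|
  label, score = token.split(':')
  [label, score.to_f]
end

# Flatten the pairs back into the "AA 0.88 BB 0.82 CC 0.77" form.
flat = pairs.map { |label, score| "#{label} #{score}" }.join(' ')
puts flat
```

Keeping the intermediate `pairs` array around means the labels and the scores stay associated, which is what the later matching and averaging steps need.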

My problem is splitting the text and collecting the scores at the same time inside a loop. Please help.

Best regards.

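PS. If a line matches a dialog condition, return its score. The score for input line 1 should be (0.88 + 0.82 + 0.77) / 3 [matches condition 1]. If there is no match, there is no score.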
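One possible way to express "average only when a dialog condition matches" is sketched below. The helpers `dialog_match?` and `average_score` are hypothetical names invented for this example; the patterns are modeled on the if/elsif chain in dialog.rb and are an assumption about what counts as a match, not the actual rule set.

```ruby
# Hypothetical helper: true when the joined labels satisfy one of the
# conditions modeled on the if/elsif chain in dialog.rb.
def dialog_match?(labels)
  s = labels.join
  return true if s =~ /AA|BB/ && s =~ /CC/
  return true if s =~ /EE.*(?:QQ|RR)/ # EE followed later by QQ or RR
  return true if s =~ /AA|EE/i && s =~ /TT/i
  return true if s =~ /DD|BB/i && s =~ /FF|AA/i
  false
end

# Hypothetical helper: average score of a hypothesis, or nil when nothing matches.
def average_score(hyp)
  pairs = hyp.split('|').map { |token| token.split(':').first(2) }
  labels = pairs.map(&:first)
  return nil unless dialog_match?(labels)

  scores = pairs.map { |_, score| score.to_f }
  scores.inject(0.0) { |sum, x| sum + x } / scores.size
end

hyps = [
  'AA:0.88:320:800|BB:0.82:1040:1330|CC:0.77:1330:1700',
  'EE:0.88:320:800|QQ:0.82:1040:1330|AA:0.77:1330:1700|RR:0.77:1330:1700|TT:0.77:1330:1700',
  'XX:0.50:0:100' # made-up line that matches no condition
]

hyps.each do |hyp|
  avg = average_score(hyp)
  puts avg ? format('%.4f', avg) : 'no score'
end
```

For the first line this gives (0.88 + 0.82 + 0.77) / 3, roughly 0.8233. Returning `nil` for an unmatched line keeps "no score" distinct from a score of zero.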

Input data
AA:0.88:320:800|BB:0.82:1040:1330|CC:0.77:1330:1700 enquire-privilege_card
BB:0.88:320:800|EE:0.82:1040:1330|FF:0.77:1330:1700 enquire-privilege_card
EE:0.88:320:800|QQ:0.82:1040:1330|AA:0.77:1330:1700|RR:0.77:1330:1700|TT:0.77:1330:1700 enquire-privilege_card

test_dialog.rb

#!/usr/bin/env ruby
# encoding: UTF-8
#
# Input file:
# hyp(with confidence score), ref_tag
#
# Output:
# hyp, ref_tag, hyp_tag, result
#

require_relative 'dialog'
require_relative 'version'

unless ARGV.length > 0
  puts 'Usage: ruby test_dialog.rb FILENAME [FILENAME2...]' 
  exit(1)
end

counter = Hash.new{|h,k| h[k]=Hash.new{|h2,k2| h2[k2]=Hash.new{|h3,k3| h3[k3]=0}}}
thresholds = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

puts %w(hyp ref_tag hyp_tag result).join("\t")
ARGV.each do |fname|
  open(fname, 'r:UTF-8').each do |line|
    hyp, ref_tag = line.strip.split(/\t/)


    key = if ref_tag == "(reject)"
            :reject
          else
            :accept
          end
    counter[fname][key][:all] += 1
    thresholds.each do |threshold|
      hyp_all = get_response_text(hyp, threshold)


      hyp_tag = if hyp_all==:reject
                  "(reject)"
                else
                  hyp_all.split(/,/)[1]
                end

      result = ref_tag==hyp_tag
      counter[fname][key][threshold] += 1 if result
      puts [hyp.split('|').map{|t| t.split(':')[0]}.join(' '),
            ref_tag, hyp_tag, result].join("\t") if threshold==0.0
    end
  end
end

STDERR.puts ["Filename", "Result"].concat(thresholds).join("\t")
counter.each do |fname, c|
  ca_all = c[:accept].delete(:all)
  cr_all = c[:reject].delete(:all)

  ca = thresholds.map{|t| c[:accept][t]}.map{|n| ca_all==0 ? "N/A" : '%4.1f' % (n.to_f/ca_all*100) }
  cr = thresholds.map{|t| c[:reject][t]}.map{|n| cr_all==0 ? "N/A" : '%4.1f' % (n.to_f/cr_all*100) }

  STDERR.puts [fname, "Correct Accept"].concat(ca).join("\t")
  STDERR.puts [fname, "Correct Reject"].concat(cr).join("\t")
end

dialog.rb

# -*- coding: utf-8 -*-
#
# text : AA:0.88:320:800|BB:0.82:1040:1330|CC:0.77:1330:1700|DD:0.71:1700:2010|EE:1.00:2070:2390|FF:0.56:320:800|GG:0.12:1330:1700
#
def get_response_text text, threshold, dsr_session_id=nil
  # ...
  #p "result text >> " + text
  # Promotion => detail => rate
  # Promotion IR/IDD => high priority than enquire-promotion
  # Rate IR/IDD => high priority than enquire-rate
  # Problem IR/IDD => high priority than enquire-service_problem
  # Internet IR/IDD => high priority than enquire-internet
  # Cancel Net => enquire-internet NOT cancel-service
  # Lost-Stolen => +Broken
  memu = ""
  intent = ""
  prompt = ""
  intent_th = ""
  intent_id = ""

  # strInput = text.gsub(/\s/,'')
  strInput = text.split('|').map{|t| t.split(':')[0]}.join('')
  puts "****strINPUT*****"
  puts strInput

  scores = text.split('|').map{|t| t.split(':')[1].to_f}
  puts "****SCORE*****"
  puts scores

  avg_score = scores.inject(0){|a,x| a + x} / scores.size
  puts "****AVG-Score*****"
  puts avg_score



  if avg_score < threshold
    return :reject
  end

  # List of Country 
  country_fname = File.dirname(__FILE__)+"/country_list.txt"
  country_list = open(country_fname, "r:UTF-8").readlines.map{|line| line.chomp}
  contry_reg = Regexp.union(country_list)

  # List of Mobile Type
  mobile_fname = File.dirname(__FILE__)+"/mobile_list.txt"
  mobile_list = open(mobile_fname, "r:UTF-8").readlines.map{|line| line.chomp}
  mobile_reg = Regexp.union(mobile_list)

  # List of Carrier
  carrier_fname = File.dirname(__FILE__)+"/carrier_list.txt"
  carrier_list = open(carrier_fname, "r:UTF-8").readlines.map{|line| line.chomp}
  carrier_reg = Regexp.union(carrier_list)


  if (strInput =~ /AA|BB/) and (strInput =~ /CC/)
    intent = "enquire-payment_method"
  elsif (strInput =~ /EE/) and ("#{$'}" =~ /QQ|RR/)
    intent = "enquire-balance_amount"
  elsif (strInput =~ /AA|EE/i) and (strInput =~ /TT/i)
    intent = "enquire-balance_unit"
  elsif (strInput =~ /DD|BB/i) and (strInput =~ /FF|AA/i)
    intent = "service-balance_amount"
  end
  # ...
end

1 Answer:

Answer 0: (score: 0)

Parse it as follows:

str = 'AA:0.88:320:800|BB:0.82:1040:1330|CC:0.77:1330:1700 enquire-privilege_card'
str.split(/[:|]/).select.with_index { |_code, i| i % 4 < 2 }.join(' ')
# => "AA 0.88 BB 0.82 CC 0.77"
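
Building on this, the selected fields could be regrouped into label/score pairs to get the average the question asks for. A minimal sketch, assuming Ruby 1.9 or later and that every token has exactly four colon-separated fields; the variable names are illustrative only.

```ruby
str = 'AA:0.88:320:800|BB:0.82:1040:1330|CC:0.77:1330:1700 enquire-privilege_card'

# Same selection as in the one-liner: keep fields 0 and 1 of every group of four.
fields = str.split(/[:|]/).select.with_index { |_field, i| i % 4 < 2 }

# Regroup into [label, score] pairs, then average the scores.
pairs = fields.each_slice(2).map { |label, score| [label, score.to_f] }
average = pairs.map(&:last).inject(0.0) { |sum, x| sum + x } / pairs.size

puts format('%.4f', average) # prints "0.8233"
```

Because the trailing tag rides along in the last (discarded) field, this only works while the field count per token stays fixed; splitting the tag off first would be more robust.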