How to group_by this array of hashes

Time: 2017-09-30 23:30:35

Tags: ruby-on-rails ruby csv enumerable

I have read data in CSV format from a file into the following array:

arr = [
["company", "location", "region", "service", "price", "duration", "disabled"], 
["Google", "Berlin", "EU", "Design with HTML/CSS", "120", "30", "false"],
["Google", "San Francisco", "US", "Design with HTML/CSS", "120", "30", "false"],
["Google", "San Francisco", "US", "Restful API design", "1500", "120", "false"],
["IBM", "San Francisco", "US", "Design with HTML/CSS", "120", "30", "true"],
["Google<script>alert('hi')<script>", "Berlin", "EU", "Practical TDD", "300", "60", "false"],
["Œoogle", "San Francisco", "US", "Restful API design", "1500", "120", "false"],
["Apple", "Berlin", "EU", "Practical TDD", "300", "60", "true"],
["Apple", "London", "EU", "Advanced AngularJS", "1200", "180", "false"],
["Apple", "New York", "US", "Restful API design", "1500", "120", "false"]
]
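Since the first element of `arr` is the header row, one way to make the rows easier to group is to turn each data row into a hash keyed by the header names. This is only a sketch on a shortened copy of the array above; the variable names `header`, `rows` and `records` are illustrative and not part of the original code.

```ruby
# Shortened sample with the same shape as the array above:
# a header row followed by data rows.
arr = [
  ["company", "location", "region", "service"],
  ["Google", "Berlin", "EU", "Design with HTML/CSS"],
  ["Google", "San Francisco", "US", "Restful API design"]
]

# Split off the header, then pair every value with its column name.
header, *rows = arr
records = rows.map { |row| header.zip(row).to_h }
```

Each element of `records` can then be read by column name (for example `records.first["company"]`), much like the `CSV::Row` objects returned when a file is read with `headers: true`.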

I want to import it into the database, based on the associations shown below:

# company.rb
  has_many :regions
  has_many :services

# region.rb
  has_many :branches
  belongs_to :company

# branch.rb
  belongs_to :region
  has_many :services

# service.rb
  belongs_to :company
  belongs_to :branch

Perhaps a hash like the one below could be used (I'm not sure; if possible, please suggest a good design):

{"Google" => [
  :name => "Google",
  :regions_attributes => {
    :name => "US", 
    :locations_attributes => {
      :name => "San Francisco"
    }
  },
  :services_attributes => [{
    :name => "Restful API design",
    ...
  },
  {
    :name => "Design with HTML/CSS",
    ...
  }]
]}

My attempt:

companies = []
CSV.foreach(csv_file, headers: true) do |row|
  company = {}
  company[:name]   = row['company']
  company[:regions_attributes] = {}
  company[:regions_attributes][:name] = row['region']
  company[:regions_attributes][:branches_attributes] = {}
  company[:regions_attributes][:branches_attributes][:name] = row['location']
  company[:services_attributes] = {}
  company[:services_attributes][:name] = row['service']
  company[:services_attributes][:price] = row['price']
  company[:services_attributes][:duration] = row['duration']
  company[:services_attributes][:disabled] = row['disabled']
  companies << company
end

companies.uniq! { |c| c.values }
companies = companies.group_by { |c| c[:name] }

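This groups the rows by company name.

I also want to group the services that belong to the same region and location, as in the example above, where San Francisco (US) has two services.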
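The grouping asked for here could be sketched with nested `group_by` calls in plain Ruby. The attribute keys (`regions_attributes`, `branches_attributes`, `services_attributes`) follow the models in the question and assume `accepts_nested_attributes_for` is declared on them, which the question does not show. The helper `service_attrs` and the sample `rows` are illustrative only.

```ruby
# Sample rows as hashes, shaped like CSV rows read with headers: true.
rows = [
  { "company" => "Google", "region" => "EU", "location" => "Berlin",
    "service" => "Design with HTML/CSS", "price" => "120", "duration" => "30", "disabled" => "false" },
  { "company" => "Google", "region" => "US", "location" => "San Francisco",
    "service" => "Design with HTML/CSS", "price" => "120", "duration" => "30", "disabled" => "false" },
  { "company" => "Google", "region" => "US", "location" => "San Francisco",
    "service" => "Restful API design", "price" => "1500", "duration" => "120", "disabled" => "false" }
]

# Hypothetical helper: picks the service columns out of one row.
def service_attrs(row)
  { name: row["service"], price: row["price"],
    duration: row["duration"], disabled: row["disabled"] }
end

# Group by company, then region, then location; each location keeps its own services.
companies = rows.group_by { |r| r["company"] }.map do |company, company_rows|
  regions = company_rows.group_by { |r| r["region"] }.map do |region, region_rows|
    branches = region_rows.group_by { |r| r["location"] }.map do |location, location_rows|
      { name: location, services_attributes: location_rows.map { |r| service_attrs(r) }.uniq }
    end
    { name: region, branches_attributes: branches }
  end
  { name: company, regions_attributes: regions }
end
```

With this shape, the San Francisco branch under the US region carries both of its services, rather than every row producing a separate company hash.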

Update

Based on Cary Swoveland's solution I was able to modify the code to fit my requirements, but the associations do not work the way I expected.

companies = CSV.read(csv_file, headers: true).group_by {|csv| csv["company"]}
final = []
companies.transform_values do |arr1|
  company = Company.new(name: arr1.pluck("company").first.encode(Encoding.find('ASCII'), encoding_options))
  services = arr1.map do |c|
    { name: c['service'], price: c['price'], duration: c['duration'], disabled: c['disabled'] }
  end.uniq
  company.services.build(services)
  regions = arr1.group_by { |csv| csv["region"] }.transform_values do |arr2|
    branches = []
    branches << arr2.pluck('location').uniq.map { |location| { name: location, services_attributes: services } }
    { name: arr2.pluck('region').uniq.first, branches_attributes: branches.flatten }
  end
  company.regions.build(regions.values)
  final << company
end

Company.import(final, recursive: true) #activerecord-import gem

1 answer:

Answer 0 (score: 2)

Consider changing the structure of the hash and building it with the code below. The file 'tmp.csv' contains the first 20 or so lines of the csv file whose link was given by the OP. I have included its contents at the end.

require 'csv'

CSV.read('tmp.csv', headers: true).group_by { |csv| csv["company"] }.
    transform_values do |arr1|
      arr1.group_by { |csv| csv["region"] }.
           transform_values do |arr2|
             arr2.group_by { |csv| csv["location"] }.
                  transform_values do |arr3|
                    arr3.map { |csv| csv["service"] }.uniq
                  end
           end
    end

If this hash format is not suitable (but its content is what is needed), it can easily be changed to a different format.
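As one illustration of such a change, the nested company/region/location hash could be mapped into arrays of attribute hashes. This is a sketch only: the `*_attributes` key names are borrowed from the question and are an assumption about what the models accept, and `nested` is a small hand-written sample rather than the real output.

```ruby
# Small sample in the company => region => location => services format.
nested = {
  "Google" => {
    "EU" => { "Berlin" => ["Design with HTML/CSS"] },
    "US" => { "San Francisco" => ["Design with HTML/CSS", "Restful API design"] }
  }
}

# Walk the three levels and emit one attributes hash per company.
companies = nested.map do |company, regions|
  {
    name: company,
    regions_attributes: regions.map do |region, locations|
      {
        name: region,
        branches_attributes: locations.map do |location, services|
          { name: location, services_attributes: services.map { |s| { name: s } } }
        end
      }
    end
  }
end
```

Only the service names survive in this format; price, duration and the disabled flag would have to be kept in the innermost arrays if they are needed.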

See the docs for CSV::read, CSV::Row#[], Enumerable#group_by and Hash#transform_values.

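I needed to do some preprocessing of the linked csv file. The problem is that the company names were preceded by a "Byte Order Mark" for UTF-8 files (search for "Okay, figured it out" here). I used the code given by Nathan Long to remove those characters. The OP will have to either write the CSV file without these marks or remove them when reading the file.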
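Removing the mark while reading could look roughly like the sketch below. It is not the code by Nathan Long referred to above, just a minimal illustration; the helper name `parse_csv_without_bom` and the sample string are made up for the example.

```ruby
require 'csv'

# Hypothetical helper: drops a leading UTF-8 byte order mark (U+FEFF),
# if present, before parsing the text as CSV with headers.
def parse_csv_without_bom(text)
  CSV.parse(text.sub(/\A\uFEFF/, ''), headers: true)
end

# Sample input that starts with a BOM, as the linked file reportedly does.
sample = "\uFEFFcompany,location\nGoogle,Berlin\n"
table = parse_csv_without_bom(sample)
first_header = table.headers.first
```

Without the stripping step the first header would be "\uFEFFcompany", so `row["company"]` would return nil for every row. When reading from a file, opening it with the "bom|utf-8" encoding (for example `File.read(path, encoding: "bom|utf-8")`) is another way to have Ruby skip the mark.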

For 'tmp.csv', the grouping code above returns:

  #=> {"Google"=>{
         "EU"=>{
           "Berlin"=>["Design with HTML/CSS","Advanced AngularJS","Restful API design"],
           "London"=>["Restful API design"]
         },
         "US"=>{
            "San Francisco"=>["Design with HTML/CSS", "Restful API design"]
         }
       },
       "Apple"=>{
         "EU"=>{
           "London"=>["Design with HTML/CSS"],
           "Berlin"=>["Restful API design"]
         },
         "US"=>{
           "San Francisco"=>["Design with HTML/CSS"]
         }
       },
       "IBM"=>{
         "US"=>{
           "San Francisco"=>["Design with HTML/CSS"]
         },
         "EU"=>{
           "Berlin"=>["Restful API design"],
           "London"=>["Restful API design"]
         }
      }
     }