I have an array named property_details_array, where each element of the array looks like:
["\n \n \n \n 1 W Maple Dr,\n Atlanta,\n GA\n 30315"]
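I'm trying to work out the best way to clean up the data so that I can turn the output into a table-like format (CSV or HTML).
Each array has more than 200 rows, so automating this would be very useful. I started parsing the data like this: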
property_details_array.each do |i|
  prop_info = i.split("\n")
  street = prop_info[4].strip
  city = prop_info[5].strip
  state = prop_info[6].strip
  zip = prop_info[7].strip
end
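However, I'm a bit stuck on where to go next. I've been considering storing this as either an array of arrays or an array of hashes, but I'm not sure whether one is better than the other given how much data I'll be working with. Both approaches seem to make sense, but since I have to clean up the data first, I'm not sure which is best.
What is the best way to store these values for later output?
New information
@tadman - Sorry for the lack of explanation, and thank you very much for the help. What I mean is that after looking more closely at the data, I realized the results are not in a predictable order. Sometimes the price is at [16], sometimes it isn't. I was trying to work out how to turn the results into an array of hashes, and then went back to the drawing board.
The raw data I'm working with looks like this: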
["\n \n \n \n 2265 Tanglewood Cir NE,\n Atlanta,\n GA\n 30345\n \n \n\n \n Dresden East\n \n \n\n $289,900\n \n \n \n 3 bd\n 2 ba\n 1,566 sq ft\n
0.3 acres lot\n \n \n \n \n Single Family Home\n \n \n \n \n
Brokered by Re/Max Town And Country\n \n \n
\n \n \n Brokered by \n Re/Max
Town And Country\n \n \n \n ", "\n \n
\n \n 2141 Dunwoody Gln,\n
Atlanta,\n GA\n 30338\n \n \n\n
\n \n $469,900\n \n \n
\n 4 bd\n 3 ba\n 2,850 sq
ft\n 0.3 acres lot\n 2 car\n
\n \n \n \n Single Family Home\n
\n \n \n \n Brokered by
Buckhead Home Realty Llc\n \n \n \n
\n \n Brokered by \n Buckhead Home
Realty Llc\n \n \n \n ", "\n \n
\n \n 1048 Martin St SE,\n
Atlanta,\n GA\n 30315\n \n \n\n
\n Intown South\n Peoplestown\n \n \n
\n $164,900\n \n \n \n
5 bd\n 3 ba\n 2,376 sq ft\n
7,405 sq ft lot\n \n \n \n \n
Single Family Home\n \n \n \n \n
Brokered by Greenlet Llc\n \n \n \n
\n \n Brokered by \n Greenlet Llc\n
\n \n \n ", "\n \n \n \n
1048 Martin St SE,\n Atlanta,\n GA\n
30315\n \n \n\n \n Intown South\n
Peoplestown\n \n \n \n $164,900\n
\n \n \n 5 bd\n 3
ba\n 2,055 sq ft\n 7,584 sq ft lot\n
\n \n \n \n Single Family Home\n
\n \n \n \n Brokered by
Greenlet, Llc\n \n \n \n \n
\n Brokered by \n Greenlet, Llc\n \n
\n \n "]
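Ideally, I'm trying to parse out the following: (street, city, state, zip, price, bd, ba, sq ft). Any thoughts on the best approach?
Answer 0 (score: 2)
If the fields are in a predictable order, why not split them into a Hash?
You can do it like this: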
FIELDS = [ nil, nil, nil, nil, :street, :city, :state, :zip ]
details.map do |d|
  Hash[
    FIELDS.zip(d.split("\n").map(&:strip)).select do |key, value|
      key
    end
  ]
end
# => [{:street=>"1 W Maple Dr,", :city=>"Atlanta,", :state=>"GA", :zip=>"30315"}]
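This constructs an array of hashes, each containing whichever fields can be mapped. The advantage here is that if your input format changes and the fields are rearranged, only the FIELDS list needs updating and your output can stay consistent.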
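For the case described in the question's update, where fields such as the price do not sit at a fixed index, one option is to recognise each field by its shape rather than its position. The following is only a sketch under assumptions drawn from the sample data: the address is taken to be the first four non-blank lines, prices start with `$`, and beds, baths and floor area end in `bd`, `ba` and `sq ft`. The helper name `parse_listing` is hypothetical.

```ruby
# Hypothetical helper: extracts fields from one raw listing string by pattern
# instead of by index. The patterns are assumptions based on the sample data.
def parse_listing(raw)
  # Drop the whitespace-only lines so position no longer matters.
  lines = raw.split("\n").map(&:strip).reject(&:empty?)

  # Assumption: the address is always the first four non-blank lines.
  street, city, state, zip = lines.first(4).map { |s| s.chomp(",") }

  {
    street: street,
    city:   city,
    state:  state,
    zip:    zip,
    price:  lines.find { |l| l.start_with?("$") },
    bd:     lines.find { |l| l.match?(/\A[\d.]+ bd\z/) }&.to_i,
    ba:     lines.find { |l| l.match?(/\A[\d.]+ ba\z/) }&.to_f,
    # The \z anchor keeps lot sizes such as "7,405 sq ft lot" from matching.
    sq_ft:  lines.find { |l| l.match?(/\A[\d,]+ sq ft\z/) }&.delete(",")&.to_i
  }
end

# Shortened version of the first listing from the question.
sample = "\n \n \n \n 2265 Tanglewood Cir NE,\n Atlanta,\n GA\n 30345\n \n" \
         " Dresden East\n \n $289,900\n \n 3 bd\n 2 ba\n 1,566 sq ft\n 0.3 acres lot\n"
p parse_listing(sample)
```

Any field that is not found comes back as nil, so a listing with a missing price or bath count still produces a hash with the same keys.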
Answer 1 (score: 0)
Storing the data in an array of hashes seems like the best option, and it is the simplest solution:
property_details_array
  .map { |row| row.split("\n") }
  .map { |prop_info| {street: prop_info[4].strip,
                      city: prop_info[5].strip,
                      state: prop_info[6].strip,
                      zip: prop_info[7].strip} }
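Once the data is an array of hashes, getting to the CSV output the question asks for is fairly direct with Ruby's csv library. This is a minimal sketch, assuming every hash has the same keys; the `rows` variable and its single sample row are illustrative, with values copied from the question.

```ruby
require "csv"

# Illustrative input in the array-of-hashes shape (values from the question).
rows = [
  { street: "1 W Maple Dr,", city: "Atlanta,", state: "GA", zip: "30315" }
]

# Column order comes from the keys of the first hash; this assumes all
# hashes share the same keys.
headers = rows.first.keys

csv_text = CSV.generate do |csv|
  csv << headers.map(&:to_s)
  rows.each { |row| csv << row.values_at(*headers) }
end

puts csv_text
```

Values that still carry a trailing comma, as in this sample, are quoted by the CSV writer, so stripping the comma first (for example with `chomp(",")`) gives cleaner output.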