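I'm trying to parse my online bank statement, retrieve the values, and then get at the individual values. Here is a sample statement. otherrefcode marks payments I sent, and refcode marks payments I received.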
Date Description Type [?] In (£) Out (£) Balance (£)
29 Aug 13 person1 otherrefcode 29AUG13 18:23 FPO 42.81 662.68
29 Aug 13 person2 otherrefcode 29AUG13 18:21 FPO 599.91 705.49
29 Aug 13 person3 refcode TFR 30.80 1,305.40
28 Aug 13 person4 otherrefcode 28AUG13 14:23 FPO 25.27 1,336.20
28 Aug 13 person5 refcode TFR 41.08 1,361.47
Here is my Ruby code. How do I get the individual values?
require 'watir-webdriver'
require 'nokogiri'
def toprice(data)
  data.to_s.match(/\d\d\.\d\d/).to_s
end
$browser = Watir::Browser.new :firefox
$browser.goto("bankurl")
$page_html = Nokogiri::HTML.parse($browser.html)
table_array = Array.new
table = $browser.table(:class,'statement smartRewardsOffers')
table.rows.each do |row|
  row_array = Array.new
  row.cells.each do |cell|
    row_array << cell.text
  end
  table_array << row_array
end
puts "1strun"
puts table_array[1..4][1]
puts "2ndrun"
puts table_array[1][1..4]
Output
1strun
person1 otherrefcode 29AUG13 18:23
FPO
42.81
2ndrun
29 Aug 13
person2 otherrefcode 29AUG13 18:21
FPO
599.91
705.49
HTML of the statement (well, the first 3 transactions; warning, it is 76 lines long.)
<table id="pnlgrpStatement:conS1:tblTransactionListView" class="statement smartRewardsOffers" summary="Table displaying the statement for your account Classic xxxxxxxxx xxxxxxxxx">
<thead>
<tr>
<th class="{sorter:false} first" scope="col">
<form id="pnlgrpStatement:conS1:tblTransactionListView:frmToggle" class="validationName:(pnlgrpStatement:conS1:tblTransactionListView:frmToggle) validate:()" enctype="application/x-www-form-urlencoded" autocomplete="off" action="/personal/a/viewproductdetails/ViewProductDetails.jsp" method="post" name="pnlgrpStatement:conS1:tblTransactionListView:frmToggle">
<input id="pnlgrpStatement:conS1:tblTransactionListView:frmToggle:btnASCSortStatements" class="tableSorter tableSorterReverse" type="image" title="Sort by oldest first" alt="Sort by oldest first" src="/wps/wcm/connect/xxxxxxxxxxxx/sort_arrow_up-8-1375113571.png?MOD=AJPERES&CACHEID=xxxxxxxxxxx" name="pnlgrpStatement:conS1:tblTransactionListView:frmToggle:btnASCSortStatements">
Date
<input type="hidden" value="pnlgrpStatement:conS1:tblTransactionListView:frmToggle" name="pnlgrpStatement:conS1:tblTransactionListView:frmToggle">
<input type="hidden" value="xxxxxxx" name="submitToken">
<input type="hidden" name="hasJS" value="true">
</form>
</th>
<th class="{sorter:false} description" scope="col">Description</th>
<th class="{sorter:false} transactionType" scope="col">
Type
<span class="cxtHelp">
<a class="cxtTrigger" href="#transForView" title="Click to find out more about transaction types">[?]</a>
</span>
</th>
<th class="{sorter:false} numeric" scope="col">In (£)</th>
<th class="{sorter:false} numeric" scope="col">Out (£)</th>
<th class="{sorter:false} numeric" scope="col">Balance (£)</th>
</tr>
</thead>
<tbody>
<tr class="alt">
<th class="first">29 Aug 13</th>
<td>
<span class="splitString">person1</span>
<span class="splitString"> </span>
<span class="splitString">ref</span>
<span class="splitString"> </span>
<span class="splitString">29AUG13 18:23</span>
<span class="splitString"> </span>
</td>
<td>
<abbr title="Faster Payments Outgoing">FPO</abbr>
</td>
<td class="numeric"></td>
<td class="numeric">42.81</td>
<td class="numeric">662.68</td>
</tr>
<tr>
<th class="first">29 Aug 13</th>
<td>
<span class="splitString">person2</span>
<span class="splitString"> </span>
<span class="splitString">ref</span>
<span class="splitString"> </span>
<span class="splitString">29AUG13 18:21</span>
<span class="splitString"> </span>
</td>
<td>
<abbr title="Faster Payments Outgoing">FPO</abbr>
</td>
<td class="numeric"></td>
<td class="numeric">599.91</td>
<td class="numeric">705.49</td>
</tr>
<tr class="alt">
<th class="first">29 Aug 13</th>
<td>
<span class="splitString">person3</span>
<span class="splitString"> </span>
<span class="splitString">ref</span>
</td>
<td>
<abbr title="Transfer">TFR</abbr>
</td>
<td class="numeric"></td>
<td class="numeric">30.80</td>
<td class="numeric">1,305.40</td>
</tr>
</tbody>
</table>
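Answer 0 (score: 0)
You have already stored the text of each cell in table_array. You just need to pick out the right cells. It is a 2D array, so the first index is the row and the second index is the column. Note that the array is 0-indexed (i.e. 0 means the first row/column), and that row 0 here is the header row, so the first transaction is at row index 1.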
# type in the first row
puts table_array[1][2]
#=> "FPO"
# person in the first row
puts table_array[1][1].split[0]
#=> "person1"
# out value in the second row
puts table_array[2][4]
#=> "599.91"
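The two puts lines in the question give different results because of the order in which the range and the index are applied. The following is a minimal sketch of that difference. SAMPLE_ROWS is a hand-typed stand-in for the scraped table_array (an assumption, with shortened header labels), copied from the sample statement rather than read from the browser.

```ruby
# Hypothetical stand-in for table_array: the header row plus the first
# three transactions from the sample statement in the question.
SAMPLE_ROWS = [
  ['Date', 'Description', 'Type', 'In', 'Out', 'Balance'],
  ['29 Aug 13', 'person1 otherrefcode 29AUG13 18:23', 'FPO', '', '42.81', '662.68'],
  ['29 Aug 13', 'person2 otherrefcode 29AUG13 18:21', 'FPO', '', '599.91', '705.49'],
  ['29 Aug 13', 'person3 refcode', 'TFR', '', '30.80', '1,305.40']
].freeze

# Range first, then index: slices out rows 1..3, then picks the second
# of those rows, so the result is the whole person2 row.
p SAMPLE_ROWS[1..3][1]

# Index first, then range: picks row 1, then columns 1..4 of that row.
p SAMPLE_ROWS[1][1..4]

# One cell: row 1, column 4 (the "Out" amount of the first transaction).
p SAMPLE_ROWS[1][4] #=> "42.81"
```

In other words, table_array[1..4][1] returns an entire row, while table_array[1][1..4] returns four cells from a single row.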
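Working with these numeric indices is not very pleasant. Splitting the description column is also harder at this point, because each cell has already been flattened to a single string. Instead, I suggest creating a hash for each row.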
table_array = Array.new
# Note: despite its name, table_rows holds the table element itself
table_rows = $browser.table(:class,'statement smartRewardsOffers')
# Skip the header row (index 0) and build one hash per transaction
table_rows.rows.to_a[1..-1].each do |row|
  row_hash = Hash.new
  row_hash[:date] = row.cell(:index => 0).text
  row_hash[:person] = row.cell(:index => 1).span(:index => 0).text
  # Rows without a code or time span fall back to ''
  row_hash[:code] = row.cell(:index => 1).span(:index => 2).text rescue ''
  row_hash[:time] = row.cell(:index => 1).span(:index => 4).text rescue ''
  row_hash[:type] = row.cell(:index => 2).text
  row_hash[:in] = row.cell(:index => 3).text
  row_hash[:out] = row.cell(:index => 4).text
  row_hash[:balance] = row.cell(:index => 5).text
  table_array << row_hash
end
# First data row's information
row = 0 # Note that the rows are 0-based index
puts table_array[row][:date] #=> "29 Aug 13"
puts table_array[row][:person] #=> "person1"
puts table_array[row][:code] #=> "ref"
puts table_array[row][:time] #=> "29AUG13 18:23"
puts table_array[row][:type] #=> "FPO"
puts table_array[row][:in] #=> ""
puts table_array[row][:out] #=> "42.81"
puts table_array[row][:balance] #=> "662.68"
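The amounts in these hashes are still strings. The toprice regex in the question, /\d\d\.\d\d/, only matches two digits before the decimal point, so it would turn "599.91" into "99.91" and "1,305.40" into "05.40". As a rough sketch of one alternative, the helper below converts an amount string into integer pence. The name amount_to_pence and the EXAMPLE_ROW hash are hypothetical and not part of the original answer, and the helper assumes non-negative amounts in the format shown in the sample statement.

```ruby
# Hypothetical helper: converts a statement amount such as "1,305.40"
# into integer pence, or returns nil when the cell is empty.
# Assumes non-negative amounts with at most two decimal places.
def amount_to_pence(text)
  cleaned = text.to_s.delete(',').strip
  return nil if cleaned.empty?

  pounds, pence = cleaned.split('.')
  # Pad a missing or one-digit pence part so "30.8" is read as 30.80.
  pounds.to_i * 100 + pence.to_s.ljust(2, '0')[0, 2].to_i
end

# Hand-typed example shaped like one of the row hashes built above.
EXAMPLE_ROW = { in: '', out: '42.81', balance: '1,305.40' }.freeze

p amount_to_pence(EXAMPLE_ROW[:in])      #=> nil
p amount_to_pence(EXAMPLE_ROW[:out])     #=> 4281
p amount_to_pence(EXAMPLE_ROW[:balance]) #=> 130540
```

Keeping money as integer pence avoids floating-point rounding when the values are later summed or compared.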